
Preprints with The Lancet is a collaboration between The Lancet Group of journals and SSRN to facilitate the open sharing of preprints for early engagement, community comment, and collaboration. Preprints available here are not Lancet publications or necessarily under review with a Lancet journal. These preprints are early-stage research papers that have not been peer-reviewed. The usual SSRN checks and a Lancet-specific check for appropriateness and transparency have been applied. The findings should not be used for clinical or public health decision-making or presented without highlighting these facts. For more information, please see the FAQs.
A Predictive Model for Continuous Myopia Progression Using Stacking Algorithm and Shapley Additive Explanations in Southern China: A Retrospective Cohort Study
18 Pages. Posted: 25 Jun 2024
Abstract
Background: The prevalence of myopia has increased rapidly in recent decades, making it a global public health problem. We aimed to explore the risk factors for myopia progression in children and teenagers and to establish a machine learning (ML) model for predicting it, using real-world longitudinal data from southern China.
Methods: In this retrospective cohort study, 13,504 children and teenagers aged 6-18 years at baseline were screened annually over three years. During data preprocessing, the Synthetic Minority Over-sampling Technique (SMOTE) was applied. Risk factors were selected through stepwise logistic regression, and a continuous myopia progression prediction model was constructed by integrating ML models with a stacking ensemble algorithm. The results were visualized using Shapley Additive Explanations (SHAP), and an online app was developed using Streamlit.
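As a rough illustration of the oversampling step named in the Methods, the sketch below shows the core idea of SMOTE: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is a minimal standard-library sketch, not the authors' pipeline; the function name `smote`, the two toy features, and all values are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of feature vectors (lists of floats) from the minority class.
    k: number of nearest minority neighbours considered for each base sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy rows: [age in years, spherical equivalent in dioptres].
minority_rows = [[8.0, -1.25], [9.0, -1.50], [10.0, -2.00], [11.0, -2.75]]
new_points = smote(minority_rows, n_new=6, k=2, seed=42)
for point in new_points:
    print([round(v, 2) for v in point])
```

Because every synthetic row is an interpolation between two observed rows, it always lies within the range of the observed minority data. Oversampling of this kind is normally applied to the training split only, so that evaluation data contain no synthetic rows.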
Findings: Among the 45 ML model combinations, the Stochastic Gradient Descent + Random Forest model demonstrated the highest discriminative ability, with an average accuracy of 0.902 and an AUROC of 0.951. Further visualizations using SHAP revealed that the contribution of risk factors was ranked in the following order: age, non-cycloplegic spherical equivalent, uncorrected distance visual acuity, school type, and gender. These findings were then translated into an online tool designed for clinical application.
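To make the SHAP-based ranking more concrete, the following sketch computes exact Shapley values for a single prediction of a toy model and ranks the features by absolute contribution. It is an illustration under stated assumptions only: the scoring function `toy_risk`, its coefficients, the three feature names, and the baseline values are all invented here and are not taken from the study, and the exact enumeration below stands in for the faster approximations that SHAP implementations use in practice.

```python
import math
from itertools import combinations


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2**n_features, so this only suits tiny examples.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in combinations(rest, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(features):
    """Hypothetical logistic score; the coefficients are invented for illustration."""
    age, se, udva = features
    z = 0.9 - 0.25 * (age - 10) - 0.8 * se + 1.5 * udva
    return 1.0 / (1.0 + math.exp(-z))


names = ["age", "spherical_equivalent", "udva"]
baseline = [12.0, -0.5, 0.1]   # hypothetical reference child
child = [8.0, -1.5, 0.4]       # hypothetical child being explained
phi = shapley_values(toy_risk, child, baseline)

# Rank features by the magnitude of their contribution to this prediction.
ranking = sorted(zip(names, phi), key=lambda item: abs(item[1]), reverse=True)
for name, contribution in ranking:
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the prediction for the explained child and the prediction for the baseline, which is the additivity property that gives SHAP its name. A global ranking such as the one reported above is typically obtained by averaging the absolute contributions over many individuals.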
Interpretation: This study established an explainable ML model that predicts continuous myopia progression with high accuracy. The model visualization and the online app make clinical monitoring of myopia progression more convenient.
Trial Registration: Registered with the Chinese Clinical Trial Registry (ChiCTR2200057391).
Funding: The National Key R&D Project of China [grant number 2020YFA0112701, Yehong Zhuo]; the National Natural Science Foundation of China [grant number 62061136001, Yizhou Wang]; the Science and Technology Program of Guangzhou, China [grant number 202206080005, Yehong Zhuo]; the Natural Science Foundation of Guangdong Province [grant number 2024A1515013058, Yingting Zhu].
Declaration of Interest: All authors declare no potential conflicts of interest.
Ethical Approval: This longitudinal cohort study was approved by the Ethics Committee of Sun Yat-sen University, Guangzhou, China (No. 2021KYPJ185). Written informed consent was not required as no identifying information was recorded.
Keywords: Continuous progression of myopia, Children and adolescents, Machine learning, Predictive model, SHAP