Accurate prediction of myopic progression and high myopia by machine learning

Jiahui Li,Simiao Zeng,Zhihuan Li,Jie Xu,Zhuo Sun,Jing Zhao,Meiyan Li,Zixing Zou,Taihua Guan,Jin Zeng,Zhuang Liu,Wenchao Xiao,Ran Wei,Hanpei Miao,Ian Ziyar,Junxiong Huang,Yuanxu Gao,Yangfa Zeng,Xing-Tao Zhou,Kang Zhang
DOI: https://doi.org/10.1093/pcmedi/pbae005
2024-01-08
Precision Clinical Medicine
Abstract: Background: Myopia is a leading cause of visual impairment in Asia and worldwide. However, accurately predicting the progression of myopia and the high risk of myopia remains a challenge. This study aims to develop a predictive model for the development of myopia. Methods: We first retrospectively gathered 612 530 medical records from five independent cohorts, encompassing 227 543 patients ranging from infants to young adults. Subsequently, we developed a multivariate linear regression algorithm model to predict the progression of myopia and the risk of high myopia. Results: The model to predict the progression of myopia achieved an R² value of 0.964 vs a mean absolute error (MAE) of 0.119 D [95% confidence interval (CI): 0.119, 1.146] in the internal validation set. It demonstrated strong generalizability, maintaining consistent performance across external validation sets: R² = 0.950 vs MAE = 0.119 D (95% CI: 0.119, 1.136) in validation study 1, R² = 0.950 vs MAE = 0.121 D (95% CI: 0.121, 1.144) in validation study 2, and R² = 0.806 vs MAE = −0.066 D (95% CI: −0.066, 0.569) in the Shanghai Children Myopia Study. In the Beijing Children Eye Study, the model achieved an R² of 0.749 vs a MAE of 0.178 D (95% CI: 0.178, 1.557). The model to predict the risk of high myopia achieved an area under the curve (AUC) of 0.99 in the internal validation set and consistently high area under the curve values of 0.99, 0.99, 0.96 and 0.99 in the respective external validation sets. Conclusion: Our study demonstrates accurate prediction of myopia progression and risk of high myopia providing valuable insights for tailoring strategies to personalize and optimize the clinical management of myopia in children.
medicine, research & experimental
What problem does this paper attempt to address?
The paper aims to develop a model that predicts myopia progression and the risk of high myopia. The research team built a multivariable linear regression model on data from five independent cohorts in China to predict how fast myopia progresses in children and adolescents and how likely they are to develop high myopia (defined as spherical equivalent ≤ −6.00 D). The datasets came from five cohorts, with medical records from Guangzhou, Zhuhai, Shanghai, and Beijing, and covered patients from infancy to young adulthood. The data were used to train the model and evaluate it on an internal validation set, followed by testing on four independent external validation sets to check generalizability.

For myopia progression, the model reached an R² of 0.964 and a mean absolute error (MAE) of 0.119 D on the internal validation set. For the risk of high myopia, it achieved an area under the curve (AUC) of 0.99 on the internal validation set. Performance on the external validation sets was broadly consistent, with R² between 0.749 and 0.950 and AUC between 0.96 and 0.99.

The study also analyzed factors associated with progression to high myopia, finding that an annual progression rate greater than 1.00 D and an early age of onset (between 3 and 7 years) were each associated with a higher likelihood of progressing to high myopia. In summary, the proposed model can inform personalized management and clinical treatment strategies for childhood myopia.
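
The metrics reported above (R², MAE, and AUC) can be illustrated with a short Python sketch. This is not the authors' code: the function names and the sample refractions are hypothetical, and the only value taken from the text is the −6.00 D cutoff for high myopia. The AUC here is computed with the standard pairwise (Mann-Whitney) formulation, and the risk score is simply the negated predicted spherical equivalent, which is an assumption made for illustration.

```python
from typing import Sequence

# Cutoff stated in the summary: high myopia is spherical equivalent <= -6.00 D.
HIGH_MYOPIA_THRESHOLD_D = -6.00


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values (in D)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def is_high_myopia(spherical_equivalent_d: float) -> bool:
    """Apply the -6.00 D definition of high myopia."""
    return spherical_equivalent_d <= HIGH_MYOPIA_THRESHOLD_D


def auc(labels: Sequence[bool], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the fraction of (positive, negative) pairs
    in which the positive case has the higher score; ties count as half."""
    pos = [s for label, s in zip(labels, scores) if label]
    neg = [s for label, s in zip(labels, scores) if not label]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical observed and predicted spherical equivalents (D) for four eyes.
observed = [-1.0, -2.0, -3.0, -6.5]
predicted = [-1.1, -1.9, -3.2, -6.4]

mae = mean_absolute_error(observed, predicted)
r2 = r2_score(observed, predicted)

# Illustrative risk score: a more negative prediction means a higher risk.
labels = [is_high_myopia(se) for se in observed]
risk_scores = [-p for p in predicted]
high_myopia_auc = auc(labels, risk_scores)
```

The two kinds of metric answer different questions: R² and MAE measure how close the predicted refraction is to the observed one, while AUC measures only how well the model ranks children who reach high myopia above those who do not.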