Interpretable Machine Learning-Derived Nomogram Model for Early Detection of Diabetic Retinopathy in Type 2 Diabetes Mellitus: A Widely Targeted Metabolomics Study

Jushuang Li, Chengnan Guo, Tao Wang, Yixi Xu, Fang Peng, Shuzhen Zhao, Huihui Li, Dongzhen Jin, Zhezheng Xia, Mingzhu Che, Jingjing Zuo, Chao Zheng, Honglin Hu, Guangyun Mao
DOI: https://doi.org/10.1038/s41387-022-00216-0
2021-01-01
SSRN Electronic Journal
Abstract:
Objective: Early identification of diabetic retinopathy (DR) is key to prioritizing therapy and preventing permanent blindness. This study aimed to develop a machine learning model for the early diagnosis of DR using metabolomic and clinical indicators.
Methods: From 2017 to 2018, 950 participants were enrolled from two affiliated hospitals of Wenzhou Medical University and Anhui Medical University. A total of 69 matched blocks comprising healthy volunteers, patients with type 2 diabetes, and patients with DR were obtained from a propensity score matching-based metabolomics study. A UPLC-ESI-MS/MS system was used to acquire serum metabolic fingerprint data. CART decision trees (DT) were used to identify potential biomarkers. Finally, a nomogram model was developed using multivariable conditional logistic regression. The calibration curve, Hosmer-Lemeshow test, receiver operating characteristic (ROC) curve, and decision curve analysis were applied to evaluate the performance of the predictive model.
Results: The mean age of enrolled subjects was 56.7 years (standard deviation 9.2), and 61.4% were male. In the DT model, 2-pyrrolidone completely separated healthy controls from diabetic patients, and thiamine triphosphate (ThTP) might be a principal metabolite for DR detection. The nomogram model (including diabetes duration, systolic blood pressure, and ThTP) showed excellent classification performance, with AUCs (95% CI) of 0.99 (0.97-1.00) and 0.99 (0.95-1.00) in the training and testing sets, respectively. The predictive model was also reasonably well calibrated.
Conclusions: The nomogram provides accurate and favorable prediction for DR detection. Further research with larger study populations is needed to confirm these findings.
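To make two of the evaluation metrics named in the Methods concrete, the following is a minimal sketch of how a ROC AUC and a decision-curve net benefit can be computed from predicted probabilities. It is not the authors' code: the function names `roc_auc` and `net_benefit` are hypothetical, and the `labels` and `scores` values are invented toy numbers, not data from the study.

```python
# Toy illustration only: the labels and scores below are made up, not study data.

def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Decision-curve net benefit at probability threshold pt:
    TP/N - (FP/N) * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical predicted probabilities for 4 non-DR (0) and 4 DR (1) cases.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.05, 0.10, 0.30, 0.60, 0.40, 0.70, 0.85, 0.95]

print("AUC:", roc_auc(labels, scores))
print("Net benefit at pt=0.5:", net_benefit(labels, scores, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing it with the "treat all" and "treat none" strategies. The pairwise AUC above is quadratic in sample size, which is acceptable for a small illustration but not for large cohorts.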