Machine learning and statistical models to predict all-cause mortality in type 2 diabetes: Results from the UK Biobank study

Tingjing Zhang, Mingyu Huang, Liangkai Chen, Yang Xia, Weiqing Min, Shuqiang Jiang
DOI: https://doi.org/10.1016/j.dsx.2024.103135
Abstract: Aims: This study aims to compare the performance of contemporary machine learning models with that of statistical models in predicting all-cause mortality in patients with type 2 diabetes mellitus and to develop a user-friendly mortality risk prediction tool. Methods: A prospective cohort study was conducted including 22,579 people with diabetes from the UK Biobank. The models evaluated were Cox proportional hazards, random survival forests (RSF), gradient boosting (GB) survival, DeepSurv, and DeepHit. Results: Over a median follow-up period of 9 years, 2,665 patients died. Machine learning models outperformed the Cox model in the validation dataset, with C-index values of 0.72-0.73 vs. 0.71 for Cox (p < 0.01). Deep learning models, particularly DeepHit, demonstrated superior calibration and achieved lower Brier scores (0.09 vs. 0.10 for Cox, p < 0.05). An online prediction tool based on DeepHit was developed for patient care: http://123.57.42.89:6006/. Conclusions: Machine learning models performed better than statistical models, highlighting the potential of machine learning techniques for predicting all-cause mortality risk and facilitating personalized healthcare management for individuals with diabetes.
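The abstract compares models by C-index but does not say how it was computed. As an illustration only, the following is a minimal sketch of Harrell's concordance index for right-censored data, assuming that higher risk scores should correspond to earlier deaths. The function name, the tie handling, and the toy data are assumptions made for this example and are not taken from the paper.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    A pair (i, j) is comparable when subject i had an observed event and
    subject j was still under follow-up at a strictly later time. The pair
    is concordant when the model assigns the higher risk score to subject i.
    Tied risk scores count as half concordant. Pairs with tied times are
    ignored in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical toy data: follow-up time in years, event flag (1 = died,
# 0 = censored), and a model's predicted risk score for each subject.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.2, 0.7, 0.4]

print(concordance_index(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported gap between 0.71 and 0.72-0.73 in context as a modest improvement in discrimination.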