Research on Diabetes Risk Prediction Model at Early Stage Based on Machine Learning
Yangyang Cui, Hankun Zhang, Song Wang, Zhenhua Liao, Weiqiang Liu
DOI: https://doi.org/10.1007/978-3-030-81007-8_69
2021-01-01
Abstract: Diabetes mellitus (DM), a common chronic disease, is one of the most serious health problems in the world in the 21st century. Because the disease can remain asymptomatic for a long time, about 50% of DM patients are undiagnosed. With the development of machine learning (ML), early diagnosis of DM has become possible. This study collected 575 sample instances covering 20 risk factors: 10 symptom factors and 10 disease factors. ML was used to construct an early risk prediction model for DM. Sixteen classifiers were compared on the data, and XGBoost was found to give the best classification performance, with Precision, Recall and F-measure of 98.80%, 99% and 98.90% respectively. The classifier was then used to analyze the risk factors that are highly correlated with DM. For the early prediction of DM, the top ten risk factors by attribute importance, from high to low, are: polydipsia, polyuria, age, pregnancies, DM history, weight loss, polyphagia, obesity, living habits, and visual blurring. This study not only completes the classification and detection of DM but can also play an important role in the control and prevention of DM. In addition, it shows that ML has great potential in the clinical application of non-invasive assessment of early DM risk.
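As a quick consistency check on the reported metrics, the sketch below recomputes the F-measure from the stated Precision and Recall. It assumes the F-measure is the usual F1 score, the harmonic mean of precision and recall. The abstract does not state how the metrics were averaged across classes, so this is an illustration under that assumption rather than the authors' evaluation code. The function name `f_measure` is hypothetical.

```python
def f_measure(precision: float, recall: float) -> float:
    """Return the F1 score, the harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is F1 (equal weight on both terms).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the best classifier.
reported_precision = 0.9880
reported_recall = 0.99

f1 = f_measure(reported_precision, reported_recall)
print(f"F-measure: {f1:.2%}")  # prints "F-measure: 98.90%"
```

Under this assumption the recomputed value agrees with the reported 98.90%.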