Risk Prediction of Diabetes Progression Using Big Data Mining with Multifarious Physical Examination Indicators
Xiaohong Chen,Shiqi Zhou,Lin Yang,Qianqian Zhong,Hongguang Liu,Yongjian Zhang,Hanyi Yu,Yongjiang Cai
DOI: https://doi.org/10.2147/dmso.s449955
2024-03-12
Diabetes Metabolic Syndrome and Obesity Targets and Therapy
Xiaohong Chen,1,* Shiqi Zhou,2,* Lin Yang,1 Qianqian Zhong,1 Hongguang Liu,3 Yongjian Zhang,1 Hanyi Yu,2 Yongjiang Cai1
1 Center of Health Management, Peking University Shenzhen Hospital, Shenzhen, People's Republic of China; 2 School of Future Technology, South China University of Technology, Guangzhou, People's Republic of China; 3 Center of Health Management, Huazhong University of Science and Technology Union Hospital (Nanshan Hospital), Shenzhen, People's Republic of China
*These authors contributed equally to this work
Correspondence: Yongjiang Cai, Center of Health Management, Peking University Shenzhen Hospital, Shenzhen, 518036, People's Republic of China; Hanyi Yu, School of Future Technology, South China University of Technology, Guangzhou, People's Republic of China
Abstract:
Purpose: This study explores the independent influencing factors for progression from normal status to prediabetes and from prediabetes to diabetes, and compares several prediction models built on those factors.
Methods: The original data in this retrospective study come from participants who underwent physical examinations at the Health Management Center of Peking University Shenzhen Hospital. For feature selection, regression analysis is applied separately to the normal and prediabetes populations and to the prediabetes and diabetes populations. The independent influencing factors identified in this way are then used as predictive factors to construct a prediction model.
Results: Physical examination indicators are selected through univariate and multivariate logistic regression and then used to train different ML models. The study finds that Age, PRO, TP, and ALT are four independent risk factors for normal people to develop prediabetes, and that GLB and HDL.C are two independent protective factors; logistic regression performs best on the testing set (Acc: 0.76, F-measure: 0.74, AUC: 0.78).
We also find that Age, Gender, BMI, SBP, U.GLU, PRO, ALT, and TG are independent risk factors for progression from prediabetes to diabetes, and that AST is an independent protective factor; logistic regression again performs best on the testing set (Acc: 0.86, F-measure: 0.84, AUC: 0.74).
Conclusion: The discussion of the clinical relationships between these indicators and diabetes supports the interpretability of our feature selection. Among four prediction models, the logistic regression model achieved the best performance on the testing set.
Keywords: prediabetes, prediction model, physical examination, machine learning, regression analysis
Diabetes is a metabolic disease characterized by hyperglycemia, which is caused by insufficient insulin secretion or reduced insulin sensitivity. Its main characteristic is a blood sugar level that stays above the normal range for a long time. Long-term abnormal blood sugar levels increase the risk of microvascular and macrovascular complications, damaging multiple organs and tissues and even leading to death. Since there is no effective cure for diabetes at present, patients need lifelong treatment, which places a heavy economic burden on patients and their families. 1 Prediabetes, also known as impaired glucose regulation (IGR), is a pathological state in which the blood sugar level is higher than normal but has not yet reached the diagnostic criteria for diabetes. 2 According to the definition of the World Health Organization (WHO), prediabetes can be divided into two types: impaired fasting glucose (IFG) and impaired glucose tolerance (IGT). Research shows that prediabetes is significantly and positively correlated with the risk of, and mortality from, obstructive sleep apnea, coronary heart disease, stroke, and complex cardiovascular disease. 3,4 Besides, prediabetes is a high-risk state for diabetes. About 5%−10% of patients with prediabetes progress to diabetes every year.
At the same time, some studies have shown that, with medical intervention, a certain percentage of prediabetes patients can bring their blood sugar back to a normal level. 5 This finding is of great significance for reducing the incidence of diabetes, improving the national health level, and reducing the burden on the medical and health system. However, the clinical symptoms of prediabetes are not obvious, and patients often miss the best opportunity for intervention. According to the WHO 1999 standard, 1 the gold standard for diagnosing diabetes is a fasting blood sugar level of ≥7.0 mmol/L or a blood sugar level two hours after oral glucose tol -Abstract Truncated-
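The abstract separates risk factors from protective factors and reports Acc, F-measure, and AUC on the testing set. The sketch below is only an illustration of how such quantities are conventionally computed, not the authors' code. It assumes that the F-measure is the F1 score, that a factor's role is read from the sign of its logistic regression coefficient (odds ratio above or below 1), and that predictions are thresholded at 0.5. The function names and the toy labels and scores are hypothetical.

```python
import math


def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


def factor_role(beta):
    """Label a coefficient by its sign (assumed convention).

    OR > 1 (beta > 0) is read as a risk factor, OR < 1 (beta < 0) as a
    protective factor. A real analysis would also require statistical
    significance before calling a factor "independent".
    """
    if beta > 0:
        return "risk"
    if beta < 0:
        return "protective"
    return "neutral"


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred):
    """F1 score: harmonic mean of precision and recall for class 1."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win. Both classes must be present in y_true.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): true labels and predicted probabilities.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("Acc:", accuracy(y_true, y_pred))
print("F-measure:", f_measure(y_true, y_pred))
print("AUC:", auc(y_true, scores))
print("beta = -0.2 ->", factor_role(-0.2), "OR =", odds_ratio(-0.2))
```

Accuracy and F-measure depend on the chosen threshold while AUC does not, which is one reason a model can score higher on Acc yet lower on AUC, as in the two sets of results reported above.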
endocrinology & metabolism