Risk Prediction of Diabetes: Big data mining with fusion of multifarious physical examination indicators
Hui Yang,Yamei Luo,Xiaolei Ren,Ming Wu,Xiaolin He,Bowen Peng,Kejun Deng,Dan Yan,Hua Tang,Hao Lin
DOI: https://doi.org/10.1016/j.inffus.2021.02.015
IF: 18.6
2021-11-01
Information Fusion
Abstract:<p>Diabetes is a global epidemic. Long-term exposure to hyperglycemia can cause chronic damage to various tissues, so early diagnosis of diabetes is crucial. In this study, we designed a computational system that predicts diabetes risk by fusing multiple types of physical examination data. We collected 1,507,563 physical examination records of healthy people and diabetes patients, as well as 387,076 physical examination records from the 2011 to 2017 follow-up records of diabetes patients in Luzhou City, China. Three types of physical examination indicators were statistically analyzed: demographics, vital signs, and laboratory values. To distinguish diabetes patients from healthy people, we developed a model based on eXtreme Gradient Boosting (XGBoost), which achieved an area under the receiver operating characteristic curve (AUC) of 0.8768. Moreover, to make the model more convenient and flexible in clinical and real-life scenarios, we established a diabetes risk scorecard based on logistic regression, which can be used to evaluate an individual's health. Lastly, we statistically analyzed the follow-up records to identify the key factors that influence how well patients control their condition. To improve diabetes cascade screening and personal lifestyle management, we established an online diabetes risk assessment system, freely accessible at <a href="http://lin-group.cn/server/DRSC/index.html">http://lin-group.cn/server/DRSC/index.html</a>. This system is expected to provide guidance for personal health management.</p>
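The abstract mentions a risk scorecard built on logistic regression but does not describe how it is constructed. As a rough illustration only, the sketch below shows one common way to turn logistic-regression log-odds into additive points, using the "points to double the odds" scaling familiar from credit scoring. The feature names, coefficients, intercept, and scaling constants are all hypothetical and are not taken from the paper, and this scaling may differ from the one the authors used.

```python
import math

# Hypothetical logistic-regression parameters (NOT from the paper).
INTERCEPT = -3.4
COEFS = {
    "age_over_60": 0.9,
    "bmi_over_28": 0.7,
    "high_fasting_glucose": 1.8,
}

# Assumed scaling constants for the scorecard.
PDO = 20.0          # points added whenever the odds of diabetes double
BASE_SCORE = 100.0  # score assigned when the odds equal BASE_ODDS
BASE_ODDS = 1.0     # odds of 1.0 correspond to a probability of 0.5

FACTOR = PDO / math.log(2)
OFFSET = BASE_SCORE - FACTOR * math.log(BASE_ODDS)


def log_odds(features):
    """Linear predictor of the logistic model for binary (0/1) features."""
    return INTERCEPT + sum(COEFS[name] * value for name, value in features.items())


def probability(features):
    """Predicted probability of diabetes under the logistic model."""
    return 1.0 / (1.0 + math.exp(-log_odds(features)))


def risk_score(features):
    """Scorecard points: a linear rescaling of the log-odds (higher = riskier)."""
    return OFFSET + FACTOR * log_odds(features)


def feature_points(name):
    """Points contributed by a single feature when it is present."""
    return FACTOR * COEFS[name]


if __name__ == "__main__":
    person = {"age_over_60": 1, "bmi_over_28": 0, "high_fasting_glucose": 1}
    print(round(risk_score(person), 1), round(probability(person), 3))
```

Because the score is a linear function of the log-odds, each indicator contributes a fixed number of points, which is what lets a scorecard be applied by hand or in a simple web form without running the underlying model.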
computer science, artificial intelligence, theory & methods