Machine Learning Algorithms for Predicting the Risk of Fracture in Patients with Diabetes in China

Sijia Chu, Aijun Jiang, Lyuzhou Chen, Xi Zhang, Xiurong Shen, Wan Zhou, Shandong Ye, Chao Chen, Shilu Zhang, Li Zhang, Yang Chen, Ya Miao, Wei Wang
DOI: https://doi.org/10.1016/j.heliyon.2023.e18186
IF: 3.776
2023-01-01
Heliyon
Abstract: Background: Patients with diabetes are more prone to fractures than those without diabetes. In clinical practice, predicting fracture risk in patients with diabetes remains difficult because existing fracture prediction tools have limited availability and accessibility in this population. The purpose of this study was to develop and validate models using machine learning (ML) algorithms to achieve high predictive power for fracture in patients with diabetes in China. Methods: The clinical data of 775 hospitalized patients with diabetes were analyzed using Decision Tree (DT), Gradient Boosting Decision Tree (GBDT), Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), eXtreme Gradient Boosting (XGBoost), and Probabilistic Classification Vector Machines (PCVM) algorithms to construct risk prediction models for fractures. In addition, the risk factors for diabetes-related fracture were identified by feature selection algorithms. Results: The ML algorithms extracted the 17 most relevant factors from raw clinical data to maximize the accuracy of the prediction results, including bone mineral density, age, sex, weight, high-density lipoprotein cholesterol, height, duration of diabetes, total cholesterol, osteocalcin, N-terminal propeptide of type I procollagen, diastolic blood pressure, and body mass index. The 7 ML models (LR, SVM, RF, DT, GBDT, XGBoost, and PCVM) had F1 scores of 0.75, 0.83, 0.84, 0.85, 0.87, 0.88, and 0.97, respectively. Conclusions: This study identified the 17 most relevant risk factors for diabetes-related fracture using ML algorithms. The PCVM model performed best in predicting fracture risk in the diabetic population. This work proposes a cheap, safe, and extensible ML algorithm for the precise assessment of risk factors for diabetes-related fracture.
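The models above are compared by F1 score, the harmonic mean of precision and recall. The following is a minimal sketch of how such a score may be computed for a binary fracture/no-fracture classifier. It assumes the binary F1 for the positive (fracture) class, which the abstract does not specify; the function name `f1_score` and the labels are hypothetical and are not data from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct positive predictions: define F1 as 0 to avoid division by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels (1 = fracture, 0 = no fracture); assumed for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Here precision and recall are both 3/4, so the F1 score is also 0.75. Because F1 ignores true negatives, it is often preferred over plain accuracy when the positive class (fracture) is the outcome of clinical interest.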