Machine Learning-Based Prediction for 4-Year Risk of Metabolic Syndrome in Adults: A Retrospective Cohort Study
Hui Zhang,Dandan Chen,Jing Shao,Ping Zou,Nianqi Cui,Leiwen Tang,Xiyi Wang,Dan Wang,Jingjie Wu,Zhihong Ye
DOI: https://doi.org/10.2147/RMHP.S328180
2021-10-20
Risk Management and Healthcare Policy
Hui Zhang,1,* Dandan Chen,1,* Jing Shao,1 Ping Zou,2 Nianqi Cui,3 Leiwen Tang,1 Xiyi Wang,4 Dan Wang,1 Jingjie Wu,1 Zhihong Ye1
1 Department of Nursing, Zhejiang University School of Medicine Sir Run Run Shaw Hospital, Hangzhou, Zhejiang, People's Republic of China; 2 Department of Scholar Practitioner Program, School of Nursing, Nipissing University, Toronto, Ontario, Canada; 3 Department of Nursing, The Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, Zhejiang, People's Republic of China; 4 Department of Nursing, School of Nursing, Shanghai JiaoTong University, Shanghai, People's Republic of China
*These authors contributed equally to this work
Correspondence: Zhihong Ye, Department of Nursing, Zhejiang University School of Medicine Sir Run Run Shaw Hospital, 3# Qingchun Dong Road, Jianggan District, Hangzhou, Zhejiang, People's Republic of China, Tel +86 13606612119
Abstract:
Purpose: Machine learning (ML) techniques have emerged as a promising tool to predict risk and make decisions in different medical domains. We aimed to compare the predictive performance of machine learning-based methods for 4-year risk of metabolic syndrome in adults with the previous model using logistic regression.
Patients and Methods: This was a retrospective cohort study that employed a temporal validation strategy. Three popular ML techniques were selected to build the prognostic models. These techniques were artificial neural networks, classification and regression tree, and support vector machine. The logistic regression algorithm and ML techniques used the same five predictors. Discrimination, calibration, Brier score, and decision curve analysis were compared for model performance.
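As an illustration of three of the performance measures named above, the following minimal Python sketch computes discrimination (as the c-statistic, or area under the ROC curve), the Brier score, and the net benefit used in decision curve analysis. It is not the authors' code: the function names, the toy outcomes and predicted risks, and the 0.3 risk threshold are all assumptions made for this example, and calibration is not covered.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Discrimination as the c-statistic: the share of (case, non-case)
    pairs in which the case has the higher predicted risk (ties count 0.5)."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    score = 0.0
    for pc in cases:
        for pn in controls:
            if pc > pn:
                score += 1.0
            elif pc == pn:
                score += 0.5
    return score / (len(cases) * len(controls))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome
    (0 is perfect; lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one risk threshold, the quantity plotted in a decision
    curve: TP/n - FP/n * threshold / (1 - threshold)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data: observed outcomes (1 = developed MetS) and
# predicted risks from some model. These values are invented.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

print("AUC:", auc(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
print("Net benefit at 0.3:", net_benefit(y_true, y_prob, 0.3))
```

Evaluating the same functions on a development set and on a later validation set, and comparing the values, is one simple way to look at how much a model's performance changes between the two.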
Results: Discrimination was above 0.7 for all models except the classification and regression tree model in internal validation, while the logistic regression model showed the highest discrimination in external validation (0.782) and the smallest discrimination differences. The logistic regression model had the best calibration performance, and the artificial neural network (ANN) also showed satisfactory calibration in internal and external validation. For overall performance, logistic regression had the smallest Brier score differences in internal and external validation, and it also had the largest net benefit in external validation.
Conclusion: Overall, this study indicated that the logistic regression model performed as well as the flexible ML-based prediction models at internal validation, while the logistic regression model had the best performance at external validation. For clinical use, when the performance of the logistic regression model is similar to that of ML-based prediction models, the simpler and more interpretable model should be chosen.
Keywords: prognosis model, metabolic syndrome, calibration, discrimination, machine learning
Metabolic syndrome (MetS) refers to a group of risk factors including hypertension, hyperglycemia, dyslipidemia, and abdominal obesity. 1 It is well known that metabolic risk factors can increase the likelihood of developing heart disease and diabetes mellitus. Research has suggested that MetS predicts a 5-fold increase in the risk of type 2 diabetes mellitus, a 1.5-fold increase in all-cause mortality, and a 2-fold increase in the risk of cardiovascular disease. 2–4 Moreover, evidence has shown that MetS is related to the occurrence of cancers and chronic kidney disease. 5,6 All of these consequences are associated with increased healthcare costs. Consequently, it is crucial to develop a prediction model that identifies individuals at high risk of MetS early so that an appropriate treatment strategy can be provided.
A prediction model can estimate the individualized absolute risk probability of a particular outcome. Prediction models can be classified into two categories: (1) diagnostic models, which are developed to identify whether a disease is present; (2) prognostic models, which are developed to detect whether an outcome will occur in the future. 7 A prediction model can motivate both physicians and patients in their clinical risk-management decisions, guide patient management, and inform health initiatives. 7 Clinical practice would therefore benefit from accurate individual estimates of MetS through the use of prediction models. A systematic review was performed previously by our team to assess the risk of bias of the prognostic prediction models for MetS. 8 We found that existing prognostic prediction models for metabolic syndrome -Abstract Truncated-
health care sciences & services,health policy & services