Hui Zhang,1,* Dandan Chen,1,* Jing Shao,1 Ping Zou,2 Nianqi Cui,3 Leiwen Tang,1 Xiyi Wang,4 Dan Wang,1 Jingjie Wu,1 Zhihong Ye1

1 Department of Nursing, Zhejiang University School of Medicine Sir Run Run Shaw Hospital, Hangzhou, Zhejiang, People's Republic of China; 2 Department of Scholar Practitioner Program, School of Nursing, Nipissing University, Toronto, Ontario, Canada; 3 Department of Nursing, The Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, Zhejiang, People's Republic of China; 4 Department of Nursing, School of Nursing, Shanghai JiaoTong University, Shanghai, People's Republic of China

*These authors contributed equally to this work

Correspondence: Zhihong Ye, Department of Nursing, Zhejiang University School of Medicine Sir Run Run Shaw Hospital, 3# Qingchun Dong Road, Jianggan District, Hangzhou, Zhejiang, People's Republic of China, Tel +86 13606612119

Abstract:

Purpose: Machine learning (ML) techniques have emerged as a promising tool to predict risk and support decisions in different medical domains. We aimed to compare the predictive performance of ML-based methods for the 4-year risk of metabolic syndrome in adults with that of the previous model, which used logistic regression.

Patients and Methods: This was a retrospective cohort study that employed a temporal validation strategy. Three popular ML techniques were selected to build the prognostic models: artificial neural networks, classification and regression tree, and support vector machine. The logistic regression algorithm and the ML techniques used the same five predictors. Model performance was compared in terms of discrimination, calibration, Brier score, and decision curve analysis.
Results: Discrimination was above 0.7 for all models except the classification and regression tree model in internal validation, while the logistic regression model showed the highest discrimination in external validation (0.782) and the smallest difference in discrimination between internal and external validation. The logistic regression model had the best calibration performance, and the artificial neural network (ANN) model also showed satisfactory calibration in both internal and external validation. For overall performance, logistic regression had the smallest difference in Brier score between internal and external validation, and it also had the largest net benefit in external validation.

Conclusion: Overall, this study indicated that the logistic regression model performed as well as the flexible ML-based prediction models at internal validation and had the best performance at external validation. For clinical use, when the performance of the logistic regression model is similar to that of ML-based prediction models, the simpler and more interpretable model should be chosen.

Keywords: prognosis model, metabolic syndrome, calibration, discrimination, machine learning

Metabolic syndrome (MetS) refers to a group of risk factors including hypertension, hyperglycemia, dyslipidemia, and abdominal obesity. 1 It is well known that metabolic risk factors can increase the likelihood of developing heart disease and diabetes mellitus. Research has suggested that MetS predicts a 5-fold increase in the risk of type 2 diabetes mellitus, a 1.5-fold increase in all-cause mortality, and a 2-fold increase in the risk of cardiovascular disease. 2–4 Moreover, evidence has shown that MetS is related to the occurrence of cancers and chronic kidney disease. 5,6 All of these consequences are associated with increased healthcare costs. Consequently, it is crucial to develop a prediction model that identifies individuals at high risk of MetS early so that an appropriate treatment strategy can be provided.
A prediction model can estimate the individualized absolute risk probability of a particular outcome. Prediction models can be classified into two categories: (1) diagnostic models, which are developed to identify whether a disease is present; (2) prognostic models, which are developed to detect whether an outcome will occur in the future. 7 A prediction model can motivate both physicians and patients in their clinical risk-management decisions, guide patient management, and inform health initiatives. 7 Clinical practice would therefore benefit from accurate individual estimates of MetS through the use of prediction models. A systematic review was performed previously by our team to assess the risk of bias of the prognostic prediction models for MetS. 8 We found that existing prognostic prediction models for metabolic syndrome -Abstract Truncated-
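
The performance measures named in the abstract (discrimination, Brier score, and net benefit from decision curve analysis) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy outcome and risk values are hypothetical, and the area under the ROC curve is assumed here as the discrimination measure, since the abstract does not name one. The net benefit formula is the standard one used in decision curve analysis.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Discrimination (assumed to be ROC AUC): the probability that a random
    case receives a higher predicted risk than a random non-case.
    Ties count as 0.5."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if pc > pn else 0.5 if pc == pn else 0.0
        for pc in cases
        for pn in controls
    )
    return wins / (len(cases) * len(controls))


def net_benefit(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> float:
    """Net benefit at one risk threshold pt: TP/n - FP/n * pt / (1 - pt)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical toy data: observed MetS status and predicted 4-year risks.
y_obs = [0, 0, 1, 0, 1, 1, 0, 1]
y_risk = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.2, 0.9]

if __name__ == "__main__":
    print("Brier score:", round(brier_score(y_obs, y_risk), 4))
    print("AUC:", round(auc(y_obs, y_risk), 4))
    print("Net benefit at 0.3:", round(net_benefit(y_obs, y_risk, 0.3), 4))
```

Under a temporal validation design, the same functions would be applied separately to the development-period and later-period predictions, and the differences between the two sets of values compared across models. A lower Brier score and a higher AUC or net benefit indicate better performance.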

A comprehensive multi-task deep learning approach for predicting metabolic syndrome with genetic, nutritional, and clinical data

Prediction of metabolic syndrome using machine learning approaches based on genetic and nutritional factors: a 14-year prospective-based cohort study

Machine learning-based predictive model for prevention of metabolic syndrome

Development of a Metabolic Syndrome Classification and Prediction Model for Koreans Using Deep Learning Technology: The Korea National Health and Nutrition Examination Survey (KNHANES) (2013–2018)

An Augmented Model with Inferred Blood Features for the Self-diagnosis of Metabolic Syndrome.

Machine learning-aided risk prediction for metabolic syndrome based on 3 years study

Machine Learning Approach for Metabolic Syndrome Diagnosis Using Explainable Data-Augmentation-Based Classification

Employing broad learning and non-invasive risk factor to improve the early diagnosis of metabolic syndrome

Machine Learning-Based Prediction for 4-Year Risk of Metabolic Syndrome in Adults: A Retrospective Cohort Study

Predictive analysis of metabolic syndrome based on 5-years continuous physical examination data

Building a model for predicting metabolic syndrome using artificial intelligence based on an investigation of whole-genome sequencing

Machine Learning Identification of Nutrient Intake Variations across Age Groups in Metabolic Syndrome and Healthy Populations

Risk prediction model of metabolic syndrome in perimenopausal women based on machine learning

Multivariate genomic analysis of 5 million people elucidates the genetic architecture of shared components of the metabolic syndrome

Integrative machine learning approaches for predicting disease risk using multi-omics data from the UK Biobank

Metabolic syndrome predictive modelling in Bangladesh applying machine learning approach

Analyzing Longitudinal Health Screening Data with Feature Ensemble and Machine Learning Techniques: Investigating Diagnostic Risk Factors of Metabolic Syndrome for Chronic Kidney Disease Stages 3a to 3b

Using Machine Learning to Predict Obesity Based on Genome-Wide and Epigenome-Wide Gene–Gene and Gene–Diet Interactions

Prediction of Myocardial Infarction Using a Combined Generative Adversarial Network Model and Feature-Enhanced Loss Function

A Machine Learning Approach for Prediction of Diabetes Mellitus

Multipartite network analysis to identify environmental and genetic associations of metabolic syndrome in the Korean population