Risk prediction model of metabolic syndrome in perimenopausal women based on machine learning

Wang Xiaoxue, Wang Zijun, Chen Shichen, Yang Mukun, Chen Yi, Miao Linqing, Bai Wenpei
DOI: https://doi.org/10.1016/j.ijmedinf.2024.105480
Abstract: Introduction: Metabolic syndrome (MetS) is considered an important indicator of cardio-metabolic health and contributes to the development of atherosclerosis and type 2 diabetes. The incidence of MetS increases significantly in postmenopausal women; the perimenopausal period is therefore considered a critical phase for prevention. We aimed to use four machine learning methods to predict whether perimenopausal women will develop MetS within 2 years. Methods: Women aged 45-55 years who underwent physical examinations in 2 consecutive years at the Ninth Clinical College of Peking University between January 2021 and December 2022 were included. We extracted 26 features from the physical examinations and used backward selection to select the 10 features that gave the largest area under the receiver operating characteristic curve (AUC). Extreme gradient boosting (XGBoost), random forest (RF), multilayer perceptron (MLP) and logistic regression (LR) were used to build the models. Their performance was measured by AUC, accuracy, precision, recall and F1 score. SHapley Additive exPlanation (SHAP) values were used to identify risk factors affecting perimenopausal MetS. Results: A total of 8,700 women had physical examination records, of whom 2,254 met the inclusion criteria. For predicting MetS events, RF and XGBoost had the highest AUC (0.96 and 0.95, respectively). XGBoost had the highest F1 score (F1 = 0.77), followed by RF, LR and MLP. SHAP values suggested that the top 5 variables affecting MetS in this study were waist circumference, fasting blood glucose, high-density lipoprotein cholesterol, triglycerides and diastolic blood pressure. Conclusion: We developed a targeted MetS risk prediction model for perimenopausal women using health examination data. This model enables early identification of women at high risk of MetS in this group, offering significant benefits for individual health management and wider socio-economic health initiatives.
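The evaluation metrics named in the abstract (AUC, accuracy, precision, recall and F1 score) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names `auc_score` and `classification_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers are unrelated to the study's data or results.

```python
def auc_score(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def classification_metrics(y_true, scores, threshold=0.5):
    """Accuracy, precision, recall and F1 after thresholding the scores.
    The 0.5 threshold is an assumption, not a value taken from the paper."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy data: 1 = develops MetS, 0 = does not.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4, 0.3, 0.05]

auc = auc_score(y_true, scores)
metrics = classification_metrics(y_true, scores)
print(f"AUC = {auc:.3f}")
for name, value in metrics.items():
    print(f"{name} = {value:.3f}")
```

Because MetS cases are a minority in screening data, AUC and F1 can rank models differently: AUC summarizes ranking quality over all thresholds, while F1 depends on the single threshold chosen. That may be one reason the abstract reports both.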
What problem does this paper attempt to address?