A comprehensive multi-task deep learning approach for predicting metabolic syndrome with genetic, nutritional, and clinical data
Minhyuk Lee, Taesung Park, Ji-Yeon Shin, Mira Park
DOI: https://doi.org/10.1038/s41598-024-68541-1
2024-08-01
Abstract: Metabolic syndrome (MetS) is a complex disorder characterized by a cluster of metabolic abnormalities, including abdominal obesity, hypertension, elevated triglycerides, reduced high-density lipoprotein cholesterol, and impaired glucose tolerance. It poses a significant public health concern because individuals with MetS are at increased risk of developing cardiovascular disease and type 2 diabetes, so early and accurate identification of at-risk individuals is essential. Various machine learning approaches have been used to predict MetS, including logistic regression, support vector machines, and several boosting techniques. However, these methods treat MetS as a single binary status and do not take into account that it comprises five components. A method that exploits this structure is therefore needed. In this study, we propose a multi-task deep learning model designed to predict MetS and its five components simultaneously. Multi-task learning handles multiple tasks with a single model, and learning related tasks jointly may enhance the model's predictive performance. To assess the efficacy of the proposed method, we compared its performance with that of several single-task approaches: logistic regression, support vector machine, CatBoost, LightGBM, XGBoost, and a one-dimensional convolutional neural network. To construct the multi-task deep learning model, we used data from the Korean Association Resource (KARE) project, which includes 352,228 single nucleotide polymorphisms (SNPs) from 7,729 individuals. In addition to genomic data, we considered lifestyle, dietary, and socio-economic factors that affect chronic diseases. Evaluated on accuracy, precision, F1-score, and the area under the receiver operating characteristic curve, our multi-task learning model surpasses traditional single-task machine learning models in predicting MetS.
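The multi-task idea described in the abstract, one model with a shared representation and one output per task, can be illustrated with a minimal forward-pass sketch. This is not the authors' architecture: the abstract does not give layer sizes, activations, or loss weighting, so the single shared hidden layer, the ReLU activation, the equal-weight averaged binary cross-entropy, and all names below (`MultiTaskNet`, `multitask_loss`, the task labels) are assumptions made for illustration only.

```python
import math
import random

# Six binary tasks: MetS itself plus its five components (labels are hypothetical names).
TASKS = ["MetS", "abdominal_obesity", "hypertension",
         "high_triglycerides", "low_hdl", "impaired_glucose"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    # Small random weights and zero biases for a fully connected layer.
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    # Affine map: one output per weight row.
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


class MultiTaskNet:
    """Shared hidden layer feeding one sigmoid head per task (assumed design)."""

    def __init__(self, n_features, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(n_features, n_hidden, rng)
        self.heads = {task: init_layer(n_hidden, 1, rng) for task in TASKS}

    def predict(self, x):
        # The shared representation is computed once and reused by every head.
        hidden = [max(0.0, h) for h in dense(x, self.shared)]  # ReLU
        return {task: sigmoid(dense(hidden, head)[0])
                for task, head in self.heads.items()}


def multitask_loss(probs, labels, eps=1e-12):
    # Equal-weight average of per-task binary cross-entropy (an assumption).
    total = 0.0
    for task in TASKS:
        p, y = probs[task], labels[task]
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(TASKS)


# Toy usage with a made-up feature vector and made-up labels.
net = MultiTaskNet(n_features=5)
probs = net.predict([0.2, -1.0, 0.5, 1.3, 0.0])
labels = {task: 0 for task in TASKS}
loss = multitask_loss(probs, labels)
```

Because every head reads the same hidden layer, a gradient step on the combined loss would update the shared weights using the signal from all six tasks, which is the mechanism by which related tasks may help one another. A single-task baseline would correspond to keeping only the `MetS` head.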