Identification of Novel Biomarkers for Metabolic Syndrome Based on Machine Learning Algorithms and Integrated Bioinformatics Analysis

Guanzhi Liu, Chen Chen, Ning Kong, Yutian Lei, Sen Luo, Zhuo Huang, Kunzheng Wang, Pei Yang, Xin Huang
DOI: https://doi.org/10.21203/rs.3.rs-225591/v1
2021-01-01
Abstract:
Background: Metabolic syndrome (MetS) is a common and complex metabolic disorder, defined as a clustering of metabolic risk factors such as insulin resistance or diabetes, obesity, hypertension, and hyperlipidemia. However, its early diagnosis is limited by the lack of definitive clinical diagnostic biomarkers. In the present study, we aimed to select several candidate genes as a blood-based, clinically applicable transcriptomic signature for metabolic syndrome.
Method: We collected the largest MetS-associated peripheral blood high-throughput transcriptomics dataset to date and proposed a novel feature selection strategy that combines weighted gene co-expression network analysis, protein-protein interaction network analysis, LASSO regression, and random forest approaches. Based on the selected hub gene signature, we then performed logistic regression analysis and established a web nomogram calculator for metabolic syndrome risk to assess the diagnostic value of the signature. Finally, Receiver Operating Characteristic curve analysis, calibration curve analysis, the Hosmer-Lemeshow goodness-of-fit test, and decision curve analysis were used to evaluate the classification and calibration performance and the potential clinical benefit of the hub gene signature.
Results: Through weighted gene co-expression network analysis and protein-protein interaction network analysis, we identified 2 gene modules and 51 hub genes associated with metabolic syndrome. We then performed further feature selection via LASSO regression and the random forest method. Finally, a 9-hub-gene signature with high diagnostic value and a web nomogram calculator for metabolic syndrome risk (https://xjtulgz.shinyapps.io/DynNomapp/) were developed. This 9-hub-gene signature showed excellent classification and calibration performance (AUC = 0.968 in the training set, AUC = 0.883 in the internal validation set, AUC = 0.861 in the external validation set) as well as favorable potential clinical benefit.
Conclusions: Given their excellent classification ability, calibration, and potential clinical benefit, the blood-based 9-hub-gene signature identified in the present study and the web nomogram calculator for metabolic syndrome risk may enable accurate noninvasive screening or diagnosis of MetS.
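The AUC values reported in the abstract can be read as the probability that a randomly chosen MetS sample receives a higher predicted risk than a randomly chosen control sample. The following is a minimal Python sketch of that rank-based computation. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not the authors' code or data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (case, control) pairs in which the case has
    the higher score; tied pairs count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: 1 = MetS, 0 = control; scores stand in for the
# predicted risk from a fitted logistic regression model.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.91, 0.78, 0.40, 0.35, 0.52, 0.10, 0.66, 0.20]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of samples, which is acceptable for a sketch; a sort-based rank formulation gives the same value more efficiently on large cohorts.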