Identification of key biomarkers for predicting atherosclerosis progression in polycystic ovary syndrome via bioinformatics analysis and machine learning

Wenjing Zhang, Yalin Wu, Yalin Yuan, Leigang Wang, Bing Yu, Xin Li, Zhong Yao, Bin Liang
DOI: https://doi.org/10.1016/j.compbiomed.2024.109239
Abstract:
Objective: Polycystic ovary syndrome (PCOS) is one of the most significant cardiovascular risk factors and plays a vital role in various cardiovascular diseases, including atherosclerosis (AS). This study aimed to identify key biomarkers for predicting AS in patients with PCOS and to investigate the role of immune cell infiltration in this process.
Methods: We downloaded the expression matrices for AS (GSE100927, GSE28829) and PCOS (GSE54248) from the Gene Expression Omnibus (GEO) database. Differential expression analysis and weighted gene co-expression network analysis (WGCNA) were used to identify PCOS-related genes in AS. Functional enrichment analysis was employed to reveal underlying mechanisms. Protein-protein interaction (PPI) analysis and three machine learning algorithms, namely the Least Absolute Shrinkage and Selection Operator (LASSO), Support Vector Machine-Recursive Feature Elimination (SVM-RFE), and Random Forest (RF), were then used to screen the hub genes. The receiver operating characteristic (ROC) curve, calibration curve, and decision curve analysis (DCA) were applied to evaluate the diagnostic value of the nomogram model. Finally, we performed immune cell infiltration analysis and single-gene gene set enrichment analysis (GSEA).
Results: A total of 41 genes were identified as PCOS-related genes in AS, and functional analysis indicated that the potential pathogenesis lies in inflammatory and immune responses. The three machine learning algorithms identified two hub genes, MMP9 and P2RY13. The nomogram model based on MMP9 and P2RY13 can serve as a new diagnostic model for identifying AS in women with PCOS (AUC > 0.9). The calibration curves and DCA curves demonstrated the excellent discriminative ability and clinical practicality of this nomogram. Immune infiltration analysis revealed dysregulation of immune cells in AS. Expression of both genes was negatively correlated with monocytes and M1 macrophages and positively correlated with M0 macrophages. Single-gene GSEA suggested that MMP9 and P2RY13 may be involved in metabolic and inflammatory responses.
Conclusion: We identified MMP9 and P2RY13 as biomarkers and, based on them, developed a new nomogram for the early diagnosis of AS in patients with PCOS. Our findings may provide new insights into the diagnosis and prevention of PCOS-associated AS and into potential treatment targets.
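
The abstract summarizes diagnostic performance as an area under the ROC curve (AUC > 0.9). As a minimal sketch of what that metric measures, and not the authors' code, the AUC can be computed as the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The function name `roc_auc` and the toy scores and labels below are illustrative assumptions, not data from the study.

```python
def roc_auc(scores, labels):
    """Return the ROC AUC via pairwise comparison of positives and negatives.

    scores: model outputs (higher means more likely to be a case).
    labels: 1 for case (e.g. AS), 0 for control.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for 4 cases and 4 controls (made-up numbers).
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
print(f"AUC = {toy_auc:.4f}")
```

In this toy example, 15 of the 16 case-control pairs are ranked correctly, so the AUC is 15/16. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.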