Machine Learning and Bioinformatics Unravel Gene Signatures of Coronary Artery Disease Comorbidity with Periodontitis
Nannan Yang, Wenqian Yang, Kun Wang, Liguo Tan, Hao Zhang, Zhen Huang, Qi Chen, Huanhuan Xing, Ying Jin, Wenting Liu, Shaobing Wang
DOI: https://doi.org/10.1101/2024.09.18.24313934
2024-09-22
Abstract:
Background: Emerging evidence suggests a complex interplay between periodontal disease (PD) and coronary artery disease (CAD) risk. Novel insights into the shared pathogenesis of PD and CAD could inform future therapeutic strategies. This study aimed to identify signature genes implicated in the progression of PD to CAD.
Methods: Gene expression data from the NCBI GEO datasets GSE10334 and GSE66360, covering PD and CAD, were analyzed to pinpoint differentially expressed genes (DEGs), followed by weighted gene co-expression network analysis (WGCNA) to identify key modules. Functional enrichment analysis of the common DEGs was conducted. Four machine learning algorithms were employed to construct predictive models, and the optimal model was selected for subsequent feature gene selection. Using GSE6751 and GSE71226 as validation cohorts, receiver operating characteristic (ROC) curves and nomograms were generated to assess diagnostic performance and predict risk. Furthermore, immune cell infiltration patterns were assessed using the CIBERSORT (Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts) algorithm. Finally, RNA sequencing (RNA-seq) of 5 clinical samples vs. 5 controls was performed to validate the identified genes and explore their potential as biomarkers for early diagnosis and prevention of comorbid periodontitis and CAD.
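As a rough illustration of the first step, the common DEGs can be thought of as the intersection of two per-dataset DEG lists. The Python sketch below assumes hypothetical cutoffs (|log2 fold change| > 1 and adjusted p < 0.05; the abstract does not state the thresholds used) and uses placeholder gene names and toy values rather than data from the study.

```python
from typing import Dict, Set, Tuple

# Per-gene statistics: (log2 fold change, adjusted p-value).
# All names and numbers here are illustrative placeholders.
Stats = Dict[str, Tuple[float, float]]


def select_degs(stats: Stats, lfc_cutoff: float = 1.0, p_cutoff: float = 0.05) -> Set[str]:
    """Return genes passing both the fold-change and the p-value cutoff."""
    return {
        gene
        for gene, (lfc, p_adj) in stats.items()
        if abs(lfc) > lfc_cutoff and p_adj < p_cutoff
    }


def common_degs(first: Stats, second: Stats) -> Set[str]:
    """Genes that are differentially expressed in both datasets."""
    return select_degs(first) & select_degs(second)


# Toy inputs standing in for the PD and CAD differential expression tables.
pd_stats: Stats = {"GENE_A": (1.8, 0.001), "GENE_B": (1.2, 0.01), "GENE_C": (0.3, 0.2)}
cad_stats: Stats = {"GENE_A": (2.1, 0.0005), "GENE_B": (0.4, 0.3), "GENE_D": (1.5, 0.02)}

shared = common_degs(pd_stats, cad_stats)
print(sorted(shared))
```

In this toy input only GENE_A passes both cutoffs in both tables, so it is the only shared gene.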
Results: Analysis of the GSE10334 and GSE66360 datasets revealed 48 common DEGs associated with CAD and PD. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of these DEGs highlighted significant overrepresentation of pathways related to inflammatory responses and immune cell trafficking, including response to lipopolysaccharides, molecules of bacterial origin, neutrophil migration, bone marrow leukocyte migration, and CXCR chemokine receptor binding. Additionally, pathways involved in lipid metabolism and atherosclerosis, such as the NF-κB signaling pathway, IL-17 signaling pathway, and TNF signaling pathway, were also enriched. Five genes (FOS, MME, PECAM1, RGS1, and VNN2) emerged as potential signature genes, demonstrating strong predictive ability with an area under the curve (AUC) greater than 0.7 in the machine learning models. CIBERSORT analysis suggested a potential role of these signature genes in modulating immune cell infiltration. In further validation, RNA-seq of clinical samples confirmed significant upregulation of FOS, VNN2, PECAM1, and MME in patients with both CAD and PD.
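For readers less familiar with the AUC criterion, the sketch below shows one standard way to compute it for a single gene: the AUC equals the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. This is a minimal illustration with made-up scores, not the authors' pipeline; the function name `roc_auc` and the 0.7 screening threshold applied in the last line are used here only to mirror the cutoff quoted in the abstract.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly.

    labels: 1 for cases, 0 for controls. Ties contribute 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    correct = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                correct += 1.0
            elif c == k:
                correct += 0.5
    return correct / (len(cases) * len(controls))


# Toy expression scores for one hypothetical gene (not study data).
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]

auc = roc_auc(scores, labels)
passes_screen = auc > 0.7  # threshold mirroring the abstract's AUC > 0.7
```

The pairwise form is quadratic in sample size, which is acceptable for small cohorts; a rank-based implementation would be preferable for larger ones.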
Conclusion: This study identified five signature genes that were significantly associated with immune cell dysregulation, four of which were verified in clinical samples. These genes hold promise for the development of a nomogram-based approach for early diagnosis of both periodontitis and coronary artery disease, potentially informing future research directions for improved diagnosis and treatment strategies in these prevalent conditions. Notably, the prominent upregulation of FOS suggests its potential as a key target for future investigations. These insights hold significant implications for improving prevention and diagnostic strategies for individuals affected by both PD and CAD.