Identification of common mechanisms and biomarkers of atrial fibrillation and heart failure based on machine learning

Zhijun Zhang, Jianying Ding, Xiaolong Mi, Yuanyuan Lin, Xinjian Li, Jun Lian, Jinwen Liu, Lijuan Qu, Bingye Zhao, Xuewen Li
DOI: https://doi.org/10.1002/ehf2.14799
2024-04-26
ESC Heart Failure
Abstract: Aims Atrial fibrillation (AF) is the most common arrhythmia, and heart failure (HF) is a disease caused by heart dysfunction. The prevalence of both AF and HF has been increasing progressively over time, and their co‐existence presents a significant therapeutic challenge. Biomarker studies are needed to provide new ideas for the diagnosis of AF and HF. Methods and results Training and validation datasets of AF and HF patient samples were downloaded from the GEO database. The R package 'limma' was used to compare gene expression levels between the disease and normal groups and to screen for differentially expressed genes (DEGs). Weighted correlation network analysis (WGCNA) identified the modules with the highest positive correlation with AF and HF. Functional enrichment analysis and protein-protein interaction (PPI) network construction were carried out for key genes. Biomarkers were screened by machine learning. The infiltration of immune cells in the AF and HF groups was evaluated with the R package 'CIBERSORT'. A miRNA network was constructed, and potential therapeutic agents for the biomarker genes were predicted using the DrugBank database. WGCNA showed that the modules most positively correlated with AF and HF were MEturquoise (r = 0.21, P value = 0.09) and MEbrown (r = 0.62, P value = 8e‐12), respectively. We screened 25 genes that were highly correlated with both AF and HF. Lasso regression analysis identified 7 and 20 core genes in the AF and HF groups, respectively. Random forest (RF) model analysis yielded the top 20 most important genes in each of the AF and HF groups as core genes. Intersecting the core genes of the four groups yielded four biomarkers: GLUL, NCF2, S100A12, and SRGN. The diagnostic efficacy of the four genes was good in the AF validation set (AUC: GLUL 0.76, NCF2 0.64, S100A12 0.68, and SRGN 0.76) as well as in the HF validation set (AUC: GLUL 0.76, NCF2 0.84, S100A12 0.92, and SRGN 0.68).
In the AF group, GLUL, NCF2, and S100A12 correlated most strongly with neutrophils, whereas SRGN correlated most strongly with resting memory CD4 T cells. In the HF group, all four genes (GLUL, NCF2, S100A12, and SRGN) were most strongly associated with neutrophils. A total of 101 miRNAs were predicted for the four genes, and a total of 10 potential therapeutic agents were predicted for GLUL, NCF2, and S100A12. Conclusions We identified four biological markers that are highly correlated with AF and HF, namely, GLUL, NCF2, S100A12, and SRGN. Our findings provide a theoretical basis for the clinical diagnosis and treatment of AF and HF.
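The biomarker selection step described in the abstract (intersecting the core genes from the Lasso and RF analyses in the AF and HF groups, then scoring each gene by AUC) can be illustrated with a minimal Python sketch. This is not the authors' code, which the abstract describes only as R-based for the 'limma' and 'CIBERSORT' steps. It assumes the "four groups" are the Lasso and RF selections for AF and for HF. The `PLACEHOLDER_*` gene names and the expression values are hypothetical and stand in for data the abstract does not report; only the four biomarker names come from the source.

```python
from typing import Dict, Sequence, Set


def shared_core_genes(selections: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes that appear in every selection set."""
    sets = list(selections.values())
    if not sets:
        return set()
    return set.intersection(*sets)


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs where the case scores
    higher than the control; ties count as half a win."""
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical selections: only the four biomarker names are from the abstract.
selections = {
    "AF_lasso": {"GLUL", "NCF2", "S100A12", "SRGN", "PLACEHOLDER_1"},
    "AF_rf": {"GLUL", "NCF2", "S100A12", "SRGN", "PLACEHOLDER_2"},
    "HF_lasso": {"GLUL", "NCF2", "S100A12", "SRGN", "PLACEHOLDER_1", "PLACEHOLDER_3"},
    "HF_rf": {"GLUL", "NCF2", "S100A12", "SRGN", "PLACEHOLDER_2"},
}
biomarkers = sorted(shared_core_genes(selections))

# Toy expression values for one gene (invented, not from the paper).
toy_disease = [2.1, 3.4, 1.9, 2.8]
toy_normal = [1.2, 2.0, 1.5, 2.5]
toy_auc = auc(toy_disease, toy_normal)

print(biomarkers)
print(toy_auc)
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve for a single continuous predictor, which is why a gene's expression level can be scored directly without fitting a model.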
cardiac & cardiovascular systems