Integrated genomics analysis highlights important SNPs and genes implicated in moderate-to-severe asthma based on GWAS and eQTL datasets

Zhouzhou Dong,Yunlong Ma,Hua Zhou,Linhui Shi,Gongjie Ye,Lei Yang,Panpan Liu,Li Zhou
DOI: https://doi.org/10.1186/s12890-020-01303-7
IF: 3.1
2020-10-16
BMC Pulmonary Medicine
Abstract:
Background: Severe asthma is a chronic disease contributing to disproportionate disease morbidity and mortality. Since 2007, many genome-wide association studies (GWAS) have documented a large number of asthma-associated genetic variants and related genes. Nevertheless, the molecular mechanisms by which these variants influence the risk of asthma or severe asthma remain largely unknown.
Methods: In the current study, we systematically integrated three independent expression quantitative trait loci (eQTL) datasets (N = 1977) and large-scale GWAS summary data for moderate-to-severe asthma (N = 30,810) using the Sherlock Bayesian analysis to identify whether expression-related variants contribute to the risk of severe asthma. Furthermore, we performed various bioinformatics analyses, including pathway enrichment analysis, PPI network enrichment analysis, in silico permutation analysis, DEG analysis and co-expression analysis, to prioritize important genes associated with severe asthma.
Results: In the discovery stage, we identified 1129 significant genes associated with moderate-to-severe asthma using the Sherlock Bayesian analysis. Two hundred twenty-eight genes were replicated using MAGMA gene-based analysis. These 228 replicated genes were enriched in 17 biological pathways, including antigen processing and presentation (Corrected P = 4.30 × 10⁻⁶), type I diabetes mellitus (Corrected P = 7.09 × 10⁻⁵), and asthma (Corrected P = 1.72 × 10⁻³). Using a series of bioinformatics analyses, we highlighted 11 important genes, such as GNGT2, TLR6, and TTC19, as authentic risk genes associated with moderate-to-severe/severe asthma. For GNGT2, three eSNPs were identified: rs17637472 (P_eQTL = 2.98 × 10⁻⁸ and P_GWAS = 3.40 × 10⁻⁸), rs11265180 (P_eQTL = 6.0 × 10⁻⁶ and P_GWAS = 1.99 × 10⁻³), and rs1867087 (P_eQTL = 1.0 × 10⁻⁴ and P_GWAS = 1.84 × 10⁻⁵). In addition, GNGT2 is significantly differentially expressed between severe asthma and mild-moderate asthma (P = 0.045), and Gngt2 shows significantly distinct expression patterns between vehicle and various glucocorticoids (ANOVA P = 1.55 × 10⁻⁶).
Conclusions: Our current study provides multiple lines of evidence supporting these 11 identified genes as important candidates implicated in the pathogenesis of severe asthma.
respiratory system
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper aims to identify single-nucleotide polymorphisms (SNPs) and genes associated with moderate-to-severe asthma through integrated genomics analysis. Specifically, the researchers pursue three goals:

1. **Identify genetic variations associated with moderate-to-severe asthma**:
   - Integrate expression quantitative trait loci (eQTL) data with large-scale genome-wide association study (GWAS) data to identify genetic variations that may affect the risk of moderate-to-severe asthma.
   - Use the Sherlock Bayesian analysis method, combining GWAS and eQTL data, to test whether expression-related variants contribute to that risk.
2. **Reveal the molecular mechanisms of these genetic variations**:
   - Through a variety of bioinformatics analyses, such as pathway enrichment analysis, protein-protein interaction network analysis, differential gene expression analysis and co-expression analysis, prioritize genes associated with moderate-to-severe asthma.
   - Validate the biological roles of these genes in different datasets to ensure the reliability and reproducibility of the results.
3. **Expand the understanding of the genetic determinants of moderate-to-severe asthma**:
   - Through systematic integration of multiple independent eQTL datasets, validate potential biological roles and further confirm the importance of these genes in the pathogenesis of moderate-to-severe asthma.
   - Identify new risk genes that may be difficult to detect in traditional GWAS but can be found through integrated analysis.

### Main research methods

1. **GWAS datasets**:
   - Use the large-scale moderate-to-severe asthma GWAS dataset (N = 30,810) reported by Shrine et al., comprising 5,135 moderate-to-severe asthma patients and 25,675 controls.
   - Construct an asthma GWAS dataset with a random phenotype as a negative control, to check that the identified genes reflect genuine biology rather than artifacts.
2. **eQTL datasets**:
   - Use three independent eQTL datasets (Zeller et al., Dixon et al., and Duan et al.), containing 1,490, 400 and 87 samples respectively, for identifying and validating risk genes.
3. **Sherlock Bayesian analysis**:
   - Integrate GWAS data and eQTL data and use the Sherlock algorithm to identify genes associated with the risk of moderate-to-severe asthma.
   - Predict asthma-related risk genes by calculating the logarithm of the Bayes factor (LBF) for each gene.
4. **Bioinformatics analysis**:
   - Conduct gene-based association analysis with the MAGMA tool, which uses linkage disequilibrium (LD) information to capture the convergent effects of multiple variants.
   - Conduct pathway enrichment analysis with the ClueGO plugin and the KEGG database to identify significant functional links.
   - Conduct protein-protein interaction network analysis with the GeneMANIA tool to generate sub-networks.
   - Compare the overlapping genes between different datasets and conduct permutation analysis to evaluate the significance of the overlap.

### Results

1. **Discovery phase**:
   - Sherlock Bayesian analysis identified 1,129 significant genes associated with moderate-to-severe asthma.
   - Of these, 228 genes were replicated in the MAGMA gene-based analysis.
2. **Pathway enrichment analysis**:
   - These genes were significantly enriched in 17 biological pathways, including antigen processing and presentation (Corrected P = 4.30×10⁻⁶), type I diabetes mellitus (Corrected P = 7.09×10⁻⁵) and asthma (Corrected P = 1.72×10⁻³).
3. **Validation phase**:
   - Using two independent eQTL datasets (Dixon et al. and Duan et al.), the Sherlock analysis was repeated and identified 964 and 771 significant or suggestive risk genes, respectively.
   - Permutation analysis showed that the genes identified in the discovery phase overlapped significantly with the genes identified in the independent datasets.
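
The overlap check in the validation phase can be illustrated with a simple permutation test. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `overlap_permutation_p`, the toy gene identifiers, the background size, and the number of permutations are all hypothetical. It draws random gene sets of the same size as the discovery set from a background list and counts how often the random overlap with the replication set is at least as large as the observed overlap.

```python
import random


def overlap_permutation_p(discovery, replication, background, n_perm=2000, seed=0):
    """Return (observed overlap, empirical P value) for two gene sets.

    Assumption: `discovery` is drawn from `background`, so random sets of
    the same size can be sampled without replacement from the background.
    """
    rng = random.Random(seed)
    discovery = set(discovery)
    replication = set(replication)
    # Sort so that sampling is reproducible for a given seed.
    background = sorted(set(background))

    observed = len(discovery & replication)
    hits = 0
    for _ in range(n_perm):
        # Random gene set of the same size as the discovery set.
        sample = rng.sample(background, len(discovery))
        if len(replication.intersection(sample)) >= observed:
            hits += 1

    # Add-one correction so the empirical P value is never exactly zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    # Toy data (hypothetical identifiers, not genes from the paper).
    background = [f"G{i}" for i in range(1000)]
    discovery = background[:50]
    replication = background[25:75]
    observed, p_value = overlap_permutation_p(discovery, replication, background)
    print(f"observed overlap = {observed}, empirical P = {p_value:.4g}")
```

With the add-one correction, the smallest reportable P value is 1 / (n_perm + 1), so the number of permutations sets the resolution of the test.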