Partial correlation network analysis identifies coordinated gene expression within a regional cluster of COPD genome-wide association signals

Michele Gentili, Kimberly Glass, Enrico Maiorino, Brian D. Hobbs, Zhonghui Xu, Peter J. Castaldi, Michael H. Cho, Craig P. Hersh, Dandi Qiao, Jarrett D. Morrow, Vincent J. Carey, John Platig, Edwin K. Silverman
DOI: https://doi.org/10.1371/journal.pcbi.1011079
2024-10-18
PLoS Computational Biology
Abstract: Chronic obstructive pulmonary disease (COPD) is a complex disease influenced by well-established environmental exposures (most notably, cigarette smoking) and incompletely defined genetic factors. The chromosome 4q region harbors multiple genetic risk loci for COPD, including signals near HHIP, FAM13A, GSTCD, TET2, and BTC. Leveraging RNA-Seq data from lung tissue in COPD cases and controls, we estimated the co-expression network for genes in the 4q region bounded by HHIP and BTC (~70 Mb), through partial correlations informed by protein-protein interactions. We identified several co-expressed gene pairs based on partial correlations, including NPNT-HHIP, BTC-NPNT, and FAM13A-TET2, which were replicated in independent lung tissue cohorts. Upon clustering the co-expression network, we observed that four genes previously associated with COPD (BTC, HHIP, NPNT, and PPM1K) appeared in the same network community. Finally, we discovered a sub-network of genes differentially co-expressed between COPD cases and controls (including FAM13A, PPA2, PPM1K, and TET2). Many of these genes were previously implicated in cell-based knock-out experiments, including the knock-out of SPP1, which lies in the same genomic region and could be a key local regulatory gene. These analyses identify chromosome 4q as a region enriched for COPD genetic susceptibility and differential co-expression.
Complex diseases, such as chronic obstructive pulmonary disease (COPD), are characterized by multifactorial causes, including multiple genetic variants and a variety of molecular functions. A 70 megabase genomic region on chromosome 4 harbors some of the strongest genetic associations with COPD based on genome-wide association studies (GWAS). In this work we study the co-expression patterns of genes located in this genomic region. We developed a new approach to compute partial correlations between pairs of expressed genes, using prior information regarding the mediating genes from protein-protein interaction networks. We identified potential biological connections between several of the COPD-related GWAS genes in this region, including BTC, NPNT, PPM1K, and HHIP, suggesting possible functional co-regulation. Furthermore, by comparing COPD cases and control subjects, we found multiple network edges whose co-expression changed between health and disease, particularly the edge between CXCL10 and CXCL11. Some of these genes were previously implicated in cell-based knock-out experiments, suggesting a common regulator, namely SPP1. These analyses provide insight into regional gene regulation of GWAS genes that may be related to COPD pathogenesis.
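To make the idea of a partial correlation concrete, here is a minimal sketch of the textbook residual-based estimator: regress both genes on a set of candidate mediating genes, then correlate the residuals. This is an illustration only and is not the authors' protein-protein-interaction-informed procedure; the function name `partial_correlation`, the variables `gene_a`, `gene_b`, and `mediator`, and the simulated expression values are all hypothetical.

```python
import numpy as np


def partial_correlation(x, y, Z):
    """Correlation between x and y after regressing out the columns of Z.

    Hypothetical helper for illustration; Z holds the expression of the
    candidate mediating genes (one column per gene).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    Z = np.asarray(Z, dtype=float).reshape(len(x), -1)
    # Design matrix with an intercept column.
    design = np.column_stack([np.ones(len(x)), Z])
    # Least-squares fits; the residuals are the parts of x and y
    # that the mediators do not explain.
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


# Simulated data (an assumption, not real expression values): two genes
# that are correlated only because both depend on a shared mediator.
rng = np.random.default_rng(0)
n = 500
mediator = rng.normal(size=n)
gene_a = mediator + 0.5 * rng.normal(size=n)
gene_b = mediator + 0.5 * rng.normal(size=n)

marginal = float(np.corrcoef(gene_a, gene_b)[0, 1])
partial = partial_correlation(gene_a, gene_b, mediator)
print(f"marginal correlation: {marginal:.2f}")
print(f"partial correlation given mediator: {partial:.2f}")
```

In this toy setting the marginal correlation is strong while the partial correlation is close to zero, which is the kind of distinction a partial-correlation network is meant to capture: an edge is kept only when two genes remain associated after accounting for the chosen mediators.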
biochemical research methods,mathematical & computational biology