G-quadruplex Structural Variations in Human Genome Associated with Single-Nucleotide Variations and Their Impact on Gene Activity.

Jia-yuan Gong, Cui-jiao Wen, Ming-liang Tang, Rui-fang Duan, Juan-nan Chen, Jia-yu Zhang, Ke-wei Zheng, Yi-de He, Yu-hua Hao, Qun Yu, Su-ping Ren, Zheng Tan
DOI: https://doi.org/10.1073/pnas.2013230118
2021-01-01
Abstract: G-quadruplexes (G4s) formed by guanine-rich nucleic acids play a role in essential biological processes such as transcription and replication. Besides the >1.5 million putative G4-forming sequences (PQSs), the human genome features >640 million single-nucleotide variations (SNVs), the most common type of genetic variation among people or populations. An SNV may alter a G4 structure when it falls within a PQS motif. To date, genome-wide PQS-SNV interactions and their impact have not been investigated. Herein, we present a study of PQS-SNV interactions and their impact on G4 structures and, subsequently, on gene expression. Based on build 154 of the Single Nucleotide Polymorphism Database (dbSNP), we identified 5 million gains/losses or structural conversions of G4s that can be caused by the SNVs. Of these G4 variations (G4Vs), 3.4 million are within genes, resulting in an average load of >120 G4Vs per gene, preferentially enriched near the transcription start site. Moreover, >80% of the G4Vs overlap with transcription factor-binding sites and >14% with enhancers, giving an average load of 3 and 7.5 for the two regulatory elements, respectively. Our experiments show that such G4Vs can significantly influence the expression of their host genes. These results reveal genome-wide G4Vs and their impact on gene activity, underscoring the need to understand the physiological function and pathological implications of genetic variation from a structural perspective. The G4Vs may also provide a unique category of drug targets for individualized therapeutics, health risk assessment, and drug development.
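As a rough illustration of how an SNV inside a PQS can create or abolish the motif, the sketch below classifies a single substitution as a gain, a loss, or no change in motif presence. It assumes the commonly used canonical pattern of four runs of at least three guanines separated by loops of 1 to 7 nucleotides; this pattern, the helper names (`has_pqs`, `apply_snv`, `classify_g4v`), and the toy sequences are illustrative assumptions, not the definition or pipeline used in the paper. It scans one strand only and does not detect structural conversions between different G4 folds.

```python
import re

# Assumed canonical PQS pattern: four G-tracts (>= 3 G each) separated by
# loops of 1-7 nucleotides. The paper's own PQS definition may differ.
PQS_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def has_pqs(seq: str) -> bool:
    """Return True if the sequence contains at least one canonical PQS."""
    return PQS_PATTERN.search(seq.upper()) is not None


def apply_snv(seq: str, pos: int, alt: str) -> str:
    """Return the sequence with the base at 0-based index `pos` replaced by `alt`."""
    return seq[:pos] + alt + seq[pos + 1:]


def classify_g4v(ref_seq: str, pos: int, alt: str) -> str:
    """Classify the effect of one substitution on PQS presence (hypothetical helper)."""
    before = has_pqs(ref_seq)
    after = has_pqs(apply_snv(ref_seq, pos, alt))
    if before and not after:
        return "loss"
    if not before and after:
        return "gain"
    # Motif presence is unchanged; a structural conversion is still possible
    # but cannot be detected by this simple presence/absence test.
    return "unchanged"


if __name__ == "__main__":
    ref = "TTGGGAGGGTGGGAGGGTT"            # toy sequence with one PQS
    print(classify_g4v(ref, 3, "A"))        # SNV breaks the first G-tract
    print(classify_g4v(ref, 5, "T"))        # SNV falls in a loop
```

An SNV that interrupts a G-tract removes the match, while one in a loop leaves the motif present; a genome-wide analysis would additionally need to scan the complementary strand and compare the predicted structures, which this sketch omits.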