Long G4-rich enhancer physically interacts with EXOC3 promoter via a G4:G4 DNA-based mechanism
Jeffrey D DeMeis, Justin T Roberts, Haley A Delcher, Noel L Godang, Alexander B Coley, Cana L Brown, Michael H Shaw, Sayema Naaz, Enas S Alsatari, Ayush Dahal, Shahem Y Alqudah, Kevin N Nguyen, Anita D Nguyen, Sunita S Paudel, Hong Dang, Wanda K. O’Neal, Michael R. Knowles, Dominika Houserova, Mark N Gillespie, Glen M Borchert
DOI: https://doi.org/10.1101/2024.01.29.577212
2024-01-31
Abstract: Enhancers are genomic sequences that function as regulatory elements capable of increasing the transcription of a given gene, often one located at a considerable distance. The broadly accepted model of enhancer activation involves bringing an enhancer-bound activator protein complex into close spatial proximity to its target promoter through chromatin looping. Equally relevant to the work described herein, roles for guanine (G)-rich sequences in transcriptional regulation are now widely accepted. Non-coding G-rich sequences are commonly found in gene promoters and enhancers, and various studies have described specific instances where G-rich sequences regulate gene expression via their capacity to form G-quadruplex (G4) structures under physiological conditions. In light of this, our group previously performed a search for long human genomic stretches significantly enriched for minimal G4 motifs (referred to as LG4s herein), leading to the identification of 301 LG4 loci with a density of at least 80 GGG repeats per 1,000 base pairs (bp) and averaging 1,843 bp in length. Further, in agreement with previous reports indicating that minimal G4s are highly enriched in promoters and enhancers, we found that 217 of the 301 LG4 sequences overlap a GeneHancer-annotated enhancer, and that the gene promoters regulated by these LG4 enhancers are similarly and markedly enriched for G4-capable sequences. Importantly, while the generally accepted model for enhancer:promoter specificity maintains that interactions are dictated by enhancer- and promoter-bound transcriptional activator proteins, the current study was designed to test an alternative hypothesis: that LG4 enhancers physically interact with their cognate promoters via a direct G4:G4 DNA-based mechanism.
As such, this work employs a combination of informatic mining and locus-specific immunoprecipitation strategies to establish the spatial proximity of enhancer:promoter pairs within the nucleus, then biochemically confirms the ability of individual LG4 ssDNAs to directly and specifically interact with DNA sequences found in their target promoters. In addition, we identify four single nucleotide polymorphisms (SNPs), occurring within an LG4 enhancer on human chromosome 5, that are significantly associated with Cystic Fibrosis (CF) lung disease severity (avg. p value = 2.83E-9), presumably due to their effects on the expression of CF-relevant genes directly regulated by this LG4 enhancer (e.g., EXOC3 and CEP72).
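The LG4 density criterion stated in the abstract (at least 80 GGG repeats per 1,000 bp) can be illustrated with a minimal Python sketch. This is not the authors' search pipeline: the function names `ggg_density` and `is_lg4_candidate` are hypothetical, and counting non-overlapping GGG runs on a single strand is an assumption made here for illustration only.

```python
import re

# Assumption: a "GGG repeat" is counted as a non-overlapping run of three Gs
# on the strand supplied; the abstract does not specify the counting rule.
GGG = re.compile("GGG")


def ggg_density(seq: str) -> float:
    """Return the number of non-overlapping GGG runs per 1,000 bp."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return len(GGG.findall(seq)) * 1000 / len(seq)


def is_lg4_candidate(seq: str, min_density: float = 80.0) -> bool:
    """Check a sequence against the 80 GGG / 1,000 bp threshold from the abstract."""
    return ggg_density(seq) >= min_density


if __name__ == "__main__":
    g_rich = "GGGA" * 250   # 1,000 bp toy sequence containing 250 GGG runs
    g_poor = "ACGT" * 250   # 1,000 bp toy sequence with no GGG run
    print(ggg_density(g_rich), is_lg4_candidate(g_rich))
    print(ggg_density(g_poor), is_lg4_candidate(g_poor))
```

A real scan would also need to consider the complementary strand (CCC runs) and a minimum locus length, neither of which is modeled in this sketch.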
Genetics