Complete analysis of G-quadruplex forming sequences in the gapless assembly of human chromosome Y

Michaela Dobrovolná, Jean Louis Mergny, Václav Brázda
DOI: https://doi.org/10.1016/j.biochi.2024.10.007
IF: 4.372
2024-10-11
Biochimie
Abstract: Recent advancements have finally delivered a complete human genome assembly, including the elusive Y chromosome. This accomplishment closes a significant knowledge gap. Prior efforts were hampered by challenges in sequencing repetitive DNA structures such as direct and inverted repeats. We used the G4Hunter algorithm to analyze the presence of G-quadruplex forming sequences (G4s) within the current human reference genome (GRCh38) and the new telomere-to-telomere (T2T) Y chromosome assemblies. This analysis served a dual purpose: identifying the location of potential G4s within the genomes and exploring their association with functionally annotated sequences. Compared to GRCh38, the T2T assembly exhibited a significantly higher prevalence of G-quadruplex forming sequences. Notably, these repeats were abundantly located around precursor RNA, exons, genes, and within protein binding sites. This remarkable co-occurrence of G4-forming sequences with these critical regulatory regions suggests their role in fundamental DNA regulation processes. Our findings indicate that the current human reference genome significantly underestimated the number of G4s, potentially overlooking their functional importance.
biochemistry & molecular biology