Quadrupia: Derivation of G-quadruplexes for organismal genomes across the tree of life
Nikol Chantzi, Akshatha Nayak, Fotis A. Baltoumas, Eleni Aplakidou, Shiau Wei Liew, Jesslyn Elvaretta Galuh, Michail Patsakis, Camille Moeckel, Ioannis Mouratidis, Saiful Arefeen Sazed, Wilfried Guiblet, Austin Montgomery, Panagiotis Karmiris-Obratanski, Wang Guliang, Apostolos Zaravinos, Karen M Vasquez, Chun Kit Kwok, Georgios Pavlopoulos, Ilias Georgakopoulos-Soares
DOI: https://doi.org/10.1101/2024.07.09.602008
2024-07-11
Abstract: G-quadruplex DNA structures exert a profound influence on essential biological processes, including transcription, replication, telomere maintenance, and genomic stability. These structures have demonstrably shaped organismal evolution. However, a comprehensive, organism-wide G-quadruplex map encompassing the diversity of life has remained elusive. Here, we introduce Quadrupia, the most extensive and well-characterized G-quadruplex database to date, facilitating the exploration of G-quadruplex structures across the evolutionary spectrum. Quadrupia identifies a total of 140,181,277 G-quadruplex sequences across 108,449 reference genomes. The database also hosts a collection of 319,784 G-quadruplex clusters of 20 or more members, annotated with taxonomic distributions, multiple sequence alignments, profile Hidden Markov Models, and cross-references to G-quadruplex 3D structures. Examination of G-quadruplexes across functional genomic elements in different taxa indicates preferential orientation and positioning, with significant differences between individual taxonomic groups. For example, we find that G-quadruplexes in bacteria with a single replication origin display a strong preference for the leading-strand orientation. Finally, we experimentally validate the most frequently observed G-quadruplexes using CD spectroscopy, UV melting, and fluorescence-based approaches. Quadrupia is publicly available at https://www.pavlopoulos-lab.org/quadrupia.
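To illustrate what identifying G-quadruplex sequences in a genome can look like, here is a minimal Python sketch. It assumes the widely used consensus motif of four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The abstract does not state which detection method Quadrupia uses, so the pattern, the loop limits, and the function name `find_g4_motifs` are illustrative assumptions, not the authors' pipeline.

```python
import re

# ASSUMPTION: consensus motif G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+.
# The abstract does not specify the detection criteria used by Quadrupia.
G4_PLUS = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")
# A G4 on the reverse strand appears as a C-rich motif on the forward strand.
G4_MINUS = re.compile(r"C{3,}(?:[ACGT]{1,7}C{3,}){3}")


def find_g4_motifs(sequence):
    """Return (start, end, strand, motif) tuples for putative G4 motifs.

    Coordinates are 0-based, end-exclusive, on the forward strand.
    Matches on the same strand do not overlap (re.finditer semantics).
    """
    seq = sequence.upper()
    hits = []
    for strand, pattern in (("+", G4_PLUS), ("-", G4_MINUS)):
        for match in pattern.finditer(seq):
            hits.append((match.start(), match.end(), strand, match.group()))
    return sorted(hits)


if __name__ == "__main__":
    # Four copies of the human telomeric repeat TTAGGG.
    for hit in find_g4_motifs("TTAGGGTTAGGGTTAGGGTTAGGG"):
        print(hit)
```

Reporting the strand alongside each hit is what makes orientation analyses possible, such as comparing motifs on the leading versus the lagging strand relative to a replication origin.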
Genomics