A basic framework governing splice-site choice in eukaryotes
Craig I Dent, Stefan Prodic, Aiswarya Balakrishnan, James Georges, Aaryan Chhabra, Sourav Mukherjee, Jordyn Coutts, Michael Gitonobel, Rucha D Sarwade, Joseph Rosenbluh, Mauro D’Amato, Partha P Das, Ya-Long Guo, Alexandre Fournier-Level, Richard Burke, Sridevi Sureshkumar, David Powell, Sureshkumar Balasubramanian
DOI: https://doi.org/10.1101/2024.03.21.586179
2024-03-27
Abstract: Changes in splicing are observed between cells, tissues, organs, individuals, and species. These changes can mediate phenotypic variation ranging from flowering time differences in plants to genetic diseases in humans. However, the genomic determinants of splicing variation are largely unknown. Here, we quantified the usage of individual splice-sites and uncovered extensive variation between individuals (genotypes) in Arabidopsis, Drosophila, and humans. We used this robust quantitative measure as a phenotype and mapped variation in splice-site usage using Genome-Wide Association Studies (GWAS). By carrying out more than 130,000 GWAS with splice-site usage phenotypes, we reveal genetic variants associated with differential usage of specific splice-sites. Our analysis conclusively shows that most of the common, genetically controlled variation in splicing is cis and there are no major trans hotspots in any of the three analyzed species. High-resolution mapping allowed us to determine genome-wide patterns that govern splice-site choice. We reveal that the variability in the intronic hexamer sequence (GT[N]4 or [N]4AG) differentiates intrinsic splice-site strength and is among the primary determinants of splice-site choice. Experimental analysis validates the primary role for intronic hexamer sequences in conferring splice-site decisions. Transcriptome analyses in diverse species across the tree of life reveal that hexamer rankings explain splice-site choices from yeast to plants to humans, forming the basic framework of the splicing code in eukaryotes.
Genetics
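As a small illustration of the intronic hexamers named in the abstract, the sketch below pulls the first six and last six bases out of an intron sequence and checks them against the canonical GT-plus-four-bases and four-bases-plus-AG patterns. It is not the authors' pipeline: the function name `intronic_hexamers`, the 12-base minimum length, and the reading of the hexamer as the six intronic bases adjacent to each splice site are assumptions made for this example.

```python
import re

# Canonical patterns (assumed reading): GT followed by four bases at the
# 5' end of the intron, and four bases followed by AG at the 3' end.
DONOR = re.compile(r"^GT[ACGT]{4}$")
ACCEPTOR = re.compile(r"^[ACGT]{4}AG$")


def intronic_hexamers(intron):
    """Return (donor_hexamer, acceptor_hexamer) for an intron given 5'->3'.

    Each element is None when the hexamer does not match the canonical
    pattern. Hypothetical helper, written only for illustration.
    """
    seq = intron.upper()
    if len(seq) < 12:
        raise ValueError("intron too short to hold two non-overlapping hexamers")
    donor, acceptor = seq[:6], seq[-6:]
    return (
        donor if DONOR.match(donor) else None,
        acceptor if ACCEPTOR.match(acceptor) else None,
    )


if __name__ == "__main__":
    # Toy intron sequence, not taken from any dataset.
    print(intronic_hexamers("GTAAGTctgacttttcctttcag"))
```

Counting how often each hexamer is used at competing splice sites would be the natural next step toward a ranking, but the abstract does not specify how that ranking is computed, so it is left out here.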