A Deep Dive into Statistical Modeling of RNA Splicing QTLs Reveals New Variants that Explain Neurodegenerative Disease

David Wang, Matthew R. Gazzara, San Jewell, Benjamin Wales-McGrath, Christopher D. Brown, Peter S. Choi, Yoseph Barash
DOI: https://doi.org/10.1101/2024.09.01.610696
2024-09-03
Abstract: Genome-wide association studies (GWAS) have identified thousands of putative disease-causing variants with unknown regulatory effects. Efforts to connect these variants with splicing quantitative trait loci (sQTLs) have provided functional insights, yet sQTLs reported by existing methods cannot explain many GWAS signals. We show that current sQTL modeling approaches can be improved by considering alternative splicing representation, model calibration, and covariate integration. We then introduce MAJIQTL, a new pipeline for sQTL discovery. MAJIQTL includes two new statistical methods: a weighted multiple testing approach for sGene discovery and a model for sQTL effect size inference to improve variant prioritization. By applying MAJIQTL to GTEx, we find significantly more sGenes harboring sQTLs with functional significance. Notably, our analysis implicates the novel variant rs582283 in Alzheimer’s disease. Using antisense oligonucleotides, we validate this variant’s effect by blocking the implicated YBX3 binding site, leading to exon skipping in the gene MS4A3.
Genomics