Functional clustering of splice-altering variants in whole genome sequencing data reveals hidden heritability in rare variant disorder

Yan Wang, Charlotte van Dijk, Ilia Timpanaro, Paul J. Hop, Brendan Kenna, Maarten Kooyman, Eleonora Aronica, R. Jeroen Pasterkamp, Leonard van den Berg, Johnathan Cooper-Knock, Project MinE ALS sequencing consortium, NYGC ALS consortium, Jan Veldink, Kevin Kenna
DOI: https://doi.org/10.1101/2023.05.30.542855
2024-10-16
Abstract: Explaining missing heritability in rare disorders requires effective methods to interpret genetic variants. Sequence-to-function models such as SpliceAI support the discovery of splice-altering variants, but filtering their output to identify pathogenic mutations remains challenging. We developed SpliPath to address two unmet needs in this process. First, SpliPath links the output of SpliceAI with reference transcriptomics data. This allows users to identify genetic variants that induce unannotated splice isoforms selectively expressed in disease models or patient tissue. Second, SpliPath aggregates variants with similar functional consequences into collapsed splicing quantitative trait loci (csQTLs) for more powerful genetic association analyses. We first used SpliPath to annotate whole genome sequencing (WGS) data from 9,467 ALS patients and controls using RNAseq data from an iPSC model of TDP-43 dysfunction. Through this, SpliPath identified 53 variants predicted to enhance cryptic exon (CE) retention events associated with a core ALS pathomechanism. We then applied SpliPath to 294 ALS patients for whom both WGS and RNAseq were available and discovered missing genetic risk in the known ALS gene KIF5A. This revealed a first-of-its-kind intronic mutation hotspot that was validated using minigene reporter assays. Finally, using the same RNAseq data, we predicted 754 candidate csQTLs for an independent WGS cohort of 6,625 ALS patients and 2,472 controls. Unbiased genome-wide csQTL association testing successfully recovered KIF5A and nominated EPG5 as a potential pathogenic gene. These effects were undetectable using simplistic SpliceAI gene burden tests. Collectively, our study demonstrates the utility of SpliPath for uncovering missing heritability in rare disorders.
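The collapsing idea described in the abstract can be illustrated with a minimal sketch. This is not SpliPath's implementation: the record fields (`variant_id`, `gene`, `junction`), the function names, and the toy cohort are all hypothetical, and a one-sided Fisher's exact test is swapped in as a simple stand-in because the abstract does not name the association statistic. The sketch groups variants by the splice junction they are predicted to induce, counts carriers of any variant in a group among cases and controls, and tests for enrichment in cases.

```python
from collections import defaultdict
from math import comb


def collapse_variants(variants):
    """Group variant IDs by the (gene, junction) they are predicted to induce.

    Each group plays the role of one collapsed unit (a csQTL-like set).
    The 'junction' field is a hypothetical label, not a SpliPath field name.
    """
    groups = defaultdict(set)
    for v in variants:
        groups[(v["gene"], v["junction"])].add(v["variant_id"])
    return dict(groups)


def carrier_counts(group_variants, genotypes, is_case):
    """Count cases and controls carrying at least one variant in the group."""
    case_carriers = 0
    control_carriers = 0
    for sample, carried in genotypes.items():
        if carried & group_variants:
            if is_case[sample]:
                case_carriers += 1
            else:
                control_carriers += 1
    return case_carriers, control_carriers


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact p-value for enrichment of carriers in cases.

    Table layout: a = case carriers, b = control carriers,
                  c = case non-carriers, d = control non-carriers.
    Returns P(X >= a) under the hypergeometric null.
    """
    n_carriers = a + b
    n_cases = a + c
    n_total = a + b + c + d
    denom = comb(n_total, n_cases)
    upper = min(n_carriers, n_cases)
    numer = sum(
        comb(n_carriers, k) * comb(n_total - n_carriers, n_cases - k)
        for k in range(a, upper + 1)
    )
    return numer / denom


# Toy data (assumed for illustration only): two variants share junction J1.
variants = [
    {"variant_id": "v1", "gene": "GENE_A", "junction": "J1"},
    {"variant_id": "v2", "gene": "GENE_A", "junction": "J1"},
    {"variant_id": "v3", "gene": "GENE_A", "junction": "J2"},
]
genotypes = {
    "s1": {"v1"}, "s2": {"v2"}, "s3": set(),
    "s4": set(), "s5": {"v3"}, "s6": set(),
}
is_case = {"s1": True, "s2": True, "s3": True,
           "s4": False, "s5": False, "s6": False}

groups = collapse_variants(variants)
n_cases = sum(is_case.values())
n_controls = len(is_case) - n_cases
for key, ids in sorted(groups.items()):
    a, b = carrier_counts(ids, genotypes, is_case)
    p = fisher_one_sided(a, b, n_cases - a, n_controls - b)
    print(key, "case carriers:", a, "control carriers:", b, "p:", round(p, 3))
```

In this toy cohort, v1 and v2 are each carried by a single case, so neither could show enrichment alone, whereas the collapsed J1 group pools both carriers into one test. That pooling is the motivation for aggregating variants with a shared predicted splicing consequence rather than testing each variant, or a whole gene, in isolation.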
Bioinformatics