Transcriptome variation in human tissues revealed by long-read sequencing
Dafni A. Glinos, Garrett Garborcauskas, Paul Hoffman, Nava Ehsan, Lihua Jiang, Alper Gokden, Xiaoguang Dai, François Aguet, Kathleen L. Brown, Kiran Garimella, Tera Bowers, Maura Costello, Kristin Ardlie, Ruiqi Jian, Nathan R. Tucker, Patrick T. Ellinor, Eoghan D. Harrington, Hua Tang, Michael Snyder, Sissel Juul, Pejman Mohammadi, Daniel G. MacArthur, Tuuli Lappalainen, Beryl B. Cummings
DOI: https://doi.org/10.1038/s41586-022-05035-y
IF: 64.8
2022-08-04
Nature
Abstract: Regulation of transcript structure generates transcript diversity and plays an important role in human disease 1,2,3,4,5,6,7. The advent of long-read sequencing technologies offers the opportunity to study the role of genetic variation in transcript structure 8,9,10,11,12,13,14,15,16. In this Article, we present a large human long-read RNA-seq dataset using the Oxford Nanopore Technologies platform from 88 samples from Genotype-Tissue Expression (GTEx) tissues and cell lines, complementing the GTEx resource. We identified just over 70,000 novel transcripts for annotated genes, and validated the protein expression of 10% of novel transcripts. We developed a new computational package, LORALS, to analyse the genetic effects of rare and common variants on the transcriptome by allele-specific analysis of long reads. We characterized allele-specific expression and transcript structure events, providing new insights into the specific transcript alterations caused by common and rare genetic variants and highlighting the resolution gained from long-read data. We were able to perturb the transcript structure upon knockdown of PTBP1, an RNA binding protein that mediates splicing, thereby finding genetic regulatory effects that are modified by the cellular environment. Finally, we used this dataset to enhance variant interpretation and study rare variants leading to aberrant splicing patterns.
multidisciplinary sciences