A High Quality Arabidopsis Transcriptome for Accurate Transcript-Level Analysis of Alternative Splicing
Runxuan Zhang, Cristiane P. G. Calixto, Yamile Marquez, Peter Venhuizen, Nikoleta A. Tzioutziou, Wenbin Guo, Mark Spensley, Juan Carlos Entizne, Dominika Lewandowska, Sara ten Have, Nicolas Frei Dit Frey, Heribert Hirt, Allan B. James, Hugh G. Nimmo, Andrea Barta, Maria Kalyna, John W. S. Brown
DOI: https://doi.org/10.1093/nar/gkx267
IF: 14.9
2017-01-01
Nucleic Acids Research
Abstract: Alternative splicing generates multiple transcript and protein isoforms from the same gene and is therefore important in the regulation of gene expression. RNA-sequencing (RNA-seq) is currently the standard method for quantifying changes in alternative splicing on a genome-wide scale. Understanding the current limitations of RNA-seq is crucial for reliable analysis, and the lack of high-quality, comprehensive transcriptomes for most species, including model organisms such as Arabidopsis, is a major constraint on the accurate quantification of transcript isoforms. To address this, we designed a novel pipeline with stringent filters and assembled a comprehensive Reference Transcript Dataset for Arabidopsis (AtRTD2) containing 82,190 non-redundant transcripts from 34,212 genes. Extensive experimental validation showed that AtRTD2 and its modified version, AtRTD2-QUASI, for use in Quantification of Alternatively Spliced Isoforms, outperform other available transcriptomes in RNA-seq analysis. This strategy can be implemented in other species to build a pipeline for transcript-level expression and alternative splicing analyses.