Single-molecule Real-time (SMRT) Isoform Sequencing (Iso-Seq) in Plants: the Status of the Bioinformatics Tools to Unravel the Transcriptome Complexity

Yubang Gao, Feihu Xi, Hangxiao Zhang, Xuqing Liu, Huiyuan Wang, Liangzhen Zhao, Anireddy S. N. Reddy, Lianfeng Gu
DOI: https://doi.org/10.2174/1574893614666190204151746
2019-01-01
Current Bioinformatics
Abstract: Background: The advent of Single-Molecule Real-time (SMRT) Isoform Sequencing (Iso-Seq) has paved the way to obtaining longer, full-length transcripts. This method has been found to be much superior to Next Generation Sequencing (NGS)-based short-read sequencing (RNA-Seq) in identifying full-length splice variants and other post-transcriptional events. Several bioinformatics tools for analyzing Iso-Seq data have been developed, and some of them are still being refined to address different aspects of transcriptome complexity. However, a comprehensive summary of the available tools and their utility is still lacking. Objective: Here, we summarize the existing Iso-Seq analysis tools and present an integrated bioinformatics pipeline for Iso-Seq analysis, which overcomes the limitations of NGS and generates long, contiguous Full-Length Non-Chimeric (FLNC) reads for the analysis of post-transcriptional events. Results: In this review, we summarize recent applications of Iso-Seq in plants, which include improved genome annotation, identification of novel genes and lncRNAs, identification of full-length splice isoforms, and detection of novel Alternative Splicing (AS) and Alternative Polyadenylation (APA) events. We also discuss the bioinformatics pipeline for comprehensive Iso-Seq data analysis, including how to reduce the error rate in the reads and how to identify and quantify post-transcriptional events. Furthermore, we discuss approaches for visualizing Iso-Seq data. Finally, we discuss methods for combining Iso-Seq data with RNA-Seq data for transcriptome quantification. Conclusion: Overall, this review demonstrates that Iso-Seq is pivotal for analyzing transcriptome complexity and that this new method offers unprecedented opportunities to comprehensively understand transcript diversity.