IsoTree: A New Framework for De Novo Transcriptome Assembly from RNA-seq Reads

Jin Zhao, Haodi Feng, Daming Zhu, Chi Zhang, Ying Xu
DOI: https://doi.org/10.1109/tcbb.2018.2808350
2018-01-01
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Abstract: High-throughput sequencing of mRNA has made deep and efficient probing of the transcriptome more affordable. However, the vast number of short RNA-seq reads makes de novo transcriptome assembly an algorithmic challenge. In this work, we present IsoTree, a novel framework for transcript reconstruction in the absence of reference genomes. Unlike most de novo assembly methods, which build a de Bruijn graph or splicing graph by connecting k-mers (sets of overlapping substrings generated from reads), IsoTree constructs the splicing graph by connecting reads directly. For each splicing graph, IsoTree applies an iterative mixed integer linear programming scheme to build a prefix tree, called an isoform tree. Each path from the root node of the isoform tree to a leaf node represents a plausible transcript candidate; these candidates are then pruned based on paired-end read information. Experiments showed that in most cases IsoTree performs better than other leading transcriptome assembly programs. IsoTree is available at https://github.com/Jane110111107/IsoTree.
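The two data structures named in the abstract can be illustrated with a minimal Python sketch. It is not IsoTree's implementation: the mixed integer linear program that decides which branches to grow is omitted, and the names `kmers`, `IsoformTreeNode`, `insert_path`, and `root_to_leaf_paths`, as well as the exon-like labels `e1` to `e4`, are hypothetical and chosen only for illustration. The sketch shows how k-mers are cut from a read and how every root-to-leaf path of a prefix tree can be read off as one transcript candidate.

```python
from typing import Dict, Iterator, List, Tuple


def kmers(read: str, k: int) -> List[str]:
    """Return all overlapping substrings of length k from a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


class IsoformTreeNode:
    """One node of a prefix tree; the label is a hypothetical graph-node ID."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.children: Dict[str, "IsoformTreeNode"] = {}


def insert_path(root: IsoformTreeNode, path: List[str]) -> None:
    """Insert a path of labels below the root, sharing common prefixes."""
    node = root
    for label in path:
        node = node.children.setdefault(label, IsoformTreeNode(label))


def root_to_leaf_paths(
    node: IsoformTreeNode, prefix: Tuple[str, ...] = ()
) -> Iterator[List[str]]:
    """Yield every root-to-leaf path; each one is a transcript candidate."""
    current = prefix + (node.label,)
    if not node.children:
        yield list(current)
        return
    for child in node.children.values():
        yield from root_to_leaf_paths(child, current)


# Assumed toy input: three candidate paths that all start at node "e1".
root = IsoformTreeNode("e1")
insert_path(root, ["e2", "e3"])
insert_path(root, ["e2", "e4"])
insert_path(root, ["e3"])

candidates = list(root_to_leaf_paths(root))
print(kmers("ACGTAC", 4))
print(candidates)
```

In this toy example the paths `e1, e2, e3` and `e1, e2, e4` share the prefix `e1, e2`, so the tree stores that prefix once. Pruning with paired-end reads, as the abstract describes, would then remove candidates from the enumerated list; that step is not shown here.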