NanoTrans: an integrated computational framework for comprehensive transcriptome analysis with Nanopore direct RNA sequencing

Ludong Yang,Xinxin Zhang,Fan Wang,Li Zhang,Jing Li,Jia-Xing Yue
DOI: https://doi.org/10.1101/2022.11.29.518309
2024-04-27
Abstract: Nanopore direct RNA sequencing (DRS) provides direct access to native RNA strands with full-length information, shedding light on rich qualitative and quantitative properties of gene expression profiles. Here with NanoTrans, we present an integrated computational framework that comprehensively covers all major DRS-based application scopes, including isoform clustering and quantification, poly(A) tail length estimation, RNA modification profiling, and fusion gene detection. In addition to its merit in providing such a streamlined one-stop solution, NanoTrans also shines in its workflow-orientated modular design, batch processing capability, all-in-one tabular and graphic report output, as well as automatic installation and configuration support. Finally, by applying NanoTrans to real DRS datasets of yeast, Arabidopsis thaliana, as well as human embryonic kidney and cancer cell lines, we further demonstrated its utility, effectiveness, and efficacy across a wide range of DRS-based application settings.
Bioinformatics
What problem does this paper attempt to address?
This paper addresses the limited scope and practicality of existing tools for analyzing Nanopore direct RNA sequencing (DRS) data. Many bioinformatics tools have been designed for Nanopore DRS data, including Mandalorion, Flair, NanoCount, nanopolish, tailfindr, EpiNano, Xpore, LongGF, and JAFFAL, but each focuses on a specific application area, and no unified framework combines their strengths. Existing pipelines such as MasterOfPores and FASTdRNA address this gap to some extent, yet they still suffer from incomplete functionality, high complexity of use, and low computational efficiency. To solve these problems, the paper introduces NanoTrans, an integrated computational framework that aims to provide a comprehensive solution for Nanopore DRS data analysis. NanoTrans covers the main application scopes of DRS technology: isoform clustering and quantification, poly(A) tail length estimation, RNA modification analysis, and fusion gene detection. It also offers a modular design, batch processing, and automatic installation and configuration support, which make it more user-friendly and efficient. By applying NanoTrans to real datasets from yeast, Arabidopsis thaliana, human embryonic kidney cell lines, and cancer cell lines, the paper demonstrates its practicality and effectiveness across a range of DRS application scenarios.