NMDtxDB: Data-driven identification and annotation of human NMD target transcripts

Thiago Britto-Borges, Niels Gehring, Volker Boehm, Christoph Dieterich
DOI: https://doi.org/10.1101/2024.01.31.578146
2024-02-02
Abstract: The nonsense-mediated RNA decay (NMD) pathway is a crucial mechanism of mRNA quality control. Current annotations of NMD substrate RNAs are rarely data-driven and instead rely on generally established rules. We introduce a dataset with four cell lines and combinations of SMG5, SMG6 and SMG7 knockdowns or SMG7 knockout. Based on this dataset, we implemented a workflow that combines Nanopore and Illumina sequencing to assemble a transcriptome enriched for NMD target transcripts. Moreover, we use coding sequence (CDS) information from Ensembl, Gencode consensus RiboSeq ORFs and OpenProt to enhance the CDS annotation of novel transcript isoforms. The transcriptome assembly yielded 302,889 transcripts, of which 48,213 contain a premature stop codon and 6,433 are significantly upregulated in three or more comparisons of NMD-active vs. NMD-deficient cell lines. We present an in-depth view of these results through the NMDtxDB database, which is available online and supports the study of NMD-sensitive transcripts. The implementations of the web application and the analysis workflow are open source.
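The selection criterion stated in the abstract (significant upregulation in three or more comparisons) can be illustrated with a minimal Python sketch. This is not the authors' workflow: the `Comparison` record, the function names, the adjusted p-value cutoff of 0.05, and the "log2 fold change above zero" definition of upregulation are all assumptions made for illustration. Only the "three or more comparisons" threshold comes from the abstract.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Comparison:
    """One NMD-active vs. NMD-deficient contrast for a single transcript.

    Hypothetical record; field names are illustrative, not from NMDtxDB.
    """
    log2_fold_change: float  # positive = higher in the NMD-deficient condition
    padj: float              # multiple-testing adjusted p-value


def count_significant_upregulation(
    comparisons: Iterable[Comparison],
    alpha: float = 0.05,  # assumed significance cutoff, not stated in the abstract
) -> int:
    """Count contrasts in which the transcript is significantly upregulated."""
    return sum(
        1 for c in comparisons
        if c.padj < alpha and c.log2_fold_change > 0
    )


def is_recurrently_upregulated(
    comparisons: Iterable[Comparison],
    min_comparisons: int = 3,  # "three or more comparisons" per the abstract
    alpha: float = 0.05,
) -> bool:
    """True if the transcript is upregulated in at least `min_comparisons` contrasts."""
    return count_significant_upregulation(comparisons, alpha) >= min_comparisons


# Made-up example values for two hypothetical transcripts.
transcript_a = [
    Comparison(1.2, 0.01),
    Comparison(0.8, 0.03),
    Comparison(2.0, 0.001),
    Comparison(-0.5, 0.02),  # significant, but downregulated: not counted
]
transcript_b = [
    Comparison(1.5, 0.04),
    Comparison(0.9, 0.20),   # upregulated, but not significant: not counted
    Comparison(1.1, 0.01),
]

a_selected = is_recurrently_upregulated(transcript_a)
b_selected = is_recurrently_upregulated(transcript_b)
```

In this sketch the premature-stop-codon status is deliberately kept out of the predicate, since the abstract reports the two counts (48,213 and 6,433) separately and does not state how they overlap.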
Bioinformatics