Effects of transcriptional noise on estimates of gene and transcript expression in RNA sequencing experiments

Ales Varabyou, Steven L. Salzberg, Mihaela Pertea
DOI: https://doi.org/10.1101/gr.266213.120
IF: 9.438
2020-12-23
Genome Research
Abstract: RNA sequencing is widely used to measure gene expression across a vast range of animal and plant tissues and conditions. Most studies of computational methods for gene expression analysis use simulated data to evaluate the accuracy of these methods. These simulations typically include reads generated from known genes at varying levels of expression. Until now, simulations did not include reads from noisy transcripts, which might include erroneous transcription, erroneous splicing, and other processes that affect transcription in living cells. Here we examine the effects of realistic amounts of transcriptional noise on the ability of leading computational methods to assemble and quantify the genes and transcripts in an RNA sequencing experiment. We show that the inclusion of noise leads to systematic errors in the ability of these programs to measure expression, including systematic underestimates of transcript abundance levels and large increases in the number of false-positive genes and transcripts. Our results also suggest that alignment-free computational methods sometimes fail to detect transcripts expressed at relatively low levels.
Keywords: genetics & heredity, biochemistry & molecular biology, biotechnology & applied microbiology