Abstract:

BACKGROUND: The advent of next-generation sequencing (NGS) technologies has permitted profiling of whole-genome transcriptomes (i.e., RNA-Seq) at unprecedented speed and very low cost. RNA-Seq provides a far more precise measurement of transcript levels and their isoforms than other methods such as microarrays. A fundamental goal of RNA-Seq is to better identify expression changes between different biological or disease conditions. However, existing methods for detecting differential expression from RNA-Seq count data have not been comprehensively evaluated in large-scale RNA-Seq datasets. Many of them suffer from inflated type I error and fail to control the false discovery rate, especially in the presence of abnormally high sequence read counts in RNA-Seq experiments.

RESULTS: To address these challenges, we propose a powerful and robust tool, termed deGPS, for detecting differential expression in RNA-Seq data. This framework contains new normalization methods based on generalized Poisson modeling of sequence count data, followed by permutation-based differential expression tests. We systematically evaluated our new tool on simulated datasets from several large-scale TCGA RNA-Seq projects, unbiased benchmark data from the compcodeR package, and real RNA-Seq data from the developmental transcriptome of Drosophila. deGPS can precisely control type I error and false discovery rate for the detection of differential expression and is robust in the presence of abnormally high sequence read counts in RNA-Seq experiments.

CONCLUSIONS: Software implementing deGPS has been released as an R package with parallel computation ( https://github.com/LL-LAB-MCW/deGPS ). deGPS is a powerful and robust tool for data normalization and for detecting differential expression in RNA-Seq experiments. Beyond RNA-Seq, deGPS has the potential to significantly enhance future data analysis efforts from many other high-throughput platforms such as ChIP-Seq, MBD-Seq and RIP-Seq.
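The permutation-based testing step mentioned in the abstract can be illustrated with a generic two-group permutation test for a single gene. This is only a minimal sketch and is not the deGPS implementation: the function name `permutation_pvalue`, the absolute difference in group means as the test statistic, and the example values are all assumptions made for illustration, and the inputs are assumed to be already-normalized expression values.

```python
import random
from statistics import mean


def permutation_pvalue(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Hypothetical helper for illustration only; the statistic used by
    deGPS itself is not reproduced here.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Reassign sample labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Example with made-up normalized values for one gene in two conditions.
control = [1, 2, 3, 2, 1, 2]
treated = [50, 60, 55, 58, 52, 57]
p_value = permutation_pvalue(control, treated, n_perm=2000)
```

Because the null distribution is built from the data by relabeling samples, a test of this kind does not rely on a parametric form for the test statistic. In a genome-wide analysis, the per-gene p-values would additionally need a multiple-testing correction to control the false discovery rate.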

PDEGEM: Modeling non-uniform read distribution in RNA-Seq data

deGPS is a Powerful Tool for Detecting Differential Expression in RNA-sequencing Studies

RNAflow: An Effective and Simple RNA-Seq Differential Gene Expression Pipeline Using Nextflow

A Unified Model for Joint Normalization and Differential Gene Expression Detection in RNA-Seq Data.

A Unified Model for Differential Expression Analysis of RNA-seq Data Via L1-Penalized Linear Regression

PennSeq: Accurate Isoform-Specific Gene Expression Quantification in RNA-Seq by Modeling Non-Uniform Read Distribution

Modeling RNA Degradation for RNA-Seq with Applications

Modeling expression ranks for noise-tolerant differential expression analysis of scRNA-seq data

Differential RNA Methylation Analysis for MeRIP-seq Data under General Experimental Design

Modeling non-uniformity in short-read rates in RNA-Seq data

A two-step strategy for detecting differential gene expression in cDNA microarray data

Using non-uniform read distribution models to improve isoform expression inference in RNA-Seq.

Dynamic Model for RNA-seq Data Analysis

PLNseq: a multivariate Poisson lognormal distribution for high-throughput matched RNA-sequencing read count data.

Estimation of Isoform Expression in RNA-Seq Data Using a Hierarchical Bayesian Model

EBSeq-HMM: a Bayesian Approach for Identifying Gene-Expression Changes in Ordered RNA-seq Experiments

Detecting Differentially Expressed Genes by Smoothing Effect of Gene Length on Variance Estimation

DEGseq: an R package for identifying differentially expressed genes from RNA-seq data

Unit-Free and Robust Detection of Differential Expression from RNA-Seq Data

Joint Estimation of Isoform Expression and Isoform-Specific Read Distribution Using Multisample RNA-Seq Data.

Statistical Modeling of RNA-Seq Data