Detecting Differentially Expressed Genes by Smoothing Effect of Gene Length on Variance Estimation

Jinyang Tang, Fei Wang
DOI: https://doi.org/10.1142/s0219720015420044
2015-01-01
Journal of Bioinformatics and Computational Biology
Abstract: Next-generation sequencing technologies are widely used in genome research, and RNA sequencing (RNA-Seq) is becoming the main application for gene expression profiling. A large number of computational methods have been developed for identifying differentially expressed (DE) genes in RNA-Seq data. However, most existing algorithms are biased toward calling long genes as DE, so short DE genes are rarely detected. In this work, we set out to gain insight into the influence of gene length on RNA-Seq data analysis and to determine the effect of gene length on the variance estimation of RNA-Seq read counts, which is important for the statistical tests used to identify DE genes. We propose a balanced method that detects short DE genes with statistical significance by smoothing a gene length factor. Computational experiments indicate that our method performs well. Software is available at http://www.iipl.fudan.edu.cn/lenseq/.