Abstract

Motivation: Tests for differentially expressed genes (DEGs) in microarray experiments rest on the null hypothesis that genes irrelevant to the phenotype/stimulus are expressed equally in the target and control samples. This strict hypothesis does not always hold, because target and control samples can differ in transcriptomic background, for example in cell/tissue type, cell cycle stage or biological donor. Such differences inflate the number of false positives, which have little biological/medical significance.

Results: In this article, we propose a statistical framework for identifying DEGs between target and control samples from expression microarray data that allows for transcriptomic background differences between the samples. It does so by introducing a modified null hypothesis: that the gene expression background difference is normally distributed. We use an iterative procedure to estimate this null distribution robustly and identify DEGs as outliers. We evaluated the method on our own triplicate microarray experiment, with validation by reverse transcription-polymerase chain reaction (RT-PCR), and on the MicroArray Quality Control dataset. The evaluations suggest that our technique (i) yields fewer false positives and false negatives, as measured by the degree of agreement with RT-PCR of the same samples; (ii) can be applied to different microarray platforms and gives better reproducibility, as measured by the concordance of DEG identification both within and across platforms; and (iii) can be applied efficiently with only a few microarray replicates. Based on these evaluations, we propose that this method not only identifies more reliable and biologically/medically significant DEGs, but also eases the power-cost tradeoff problem in the microarray field.
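
The abstract does not spell out the iterative procedure, so the following is only a minimal sketch of the general idea: fit a normal distribution to per-gene target-minus-control log-expression differences, flag genes far from its centre as outliers, refit on the remaining genes, and repeat until the flagged set stops changing. The function name `iterative_normal_null`, the cutoff `z_cut` and the simulated data are all assumptions made for illustration; this generic iterative clipping scheme is not necessarily the authors' estimator.

```python
import numpy as np


def iterative_normal_null(log_ratios, z_cut=3.0, max_iter=100):
    """Hypothetical helper: fit a normal null by iterative outlier removal.

    log_ratios: per-gene log-expression differences (target minus control).
    Returns (mu, sigma, is_outlier), where is_outlier marks candidate DEGs.
    """
    d = np.asarray(log_ratios, dtype=float)
    inlier = np.ones(d.shape, dtype=bool)  # start by treating every gene as background
    mu, sigma = d.mean(), d.std(ddof=1)
    for _ in range(max_iter):
        # Re-estimate the null from the genes currently treated as background.
        mu = d[inlier].mean()
        sigma = d[inlier].std(ddof=1)
        new_inlier = np.abs(d - mu) <= z_cut * sigma
        # Stop when the background set is stable or too small to refit.
        if new_inlier.sum() < 2 or np.array_equal(new_inlier, inlier):
            break
        inlier = new_inlier
    return mu, sigma, ~inlier


# Simulated example (assumed values, not data from the study):
# 2000 background genes with a non-zero mean shift, plus 50 true DEGs.
rng = np.random.default_rng(0)
background = rng.normal(loc=0.3, scale=0.5, size=2000)
up = rng.normal(loc=3.5, scale=0.5, size=25)
down = rng.normal(loc=-3.0, scale=0.5, size=25)
diffs = np.concatenate([background, up, down])

mu, sigma, is_deg = iterative_normal_null(diffs)
print(f"null mean={mu:.2f}, null sd={sigma:.2f}, flagged={is_deg.sum()}")
```

In this sketch the null mean is allowed to differ from zero, which is what lets a systematic background difference be absorbed into the null rather than reported as differential expression. Because the fit uses only the retained genes, the estimated standard deviation is slightly smaller than that of the full background distribution; a more careful implementation might correct for that truncation.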

Novel Statistical Framework to Identify Differentially Expressed Genes Allowing Transcriptomic Background Differences
