Abstract: Accounting for batch effects, especially latent batch effects, in differential expression (DE) analysis is critical for identifying true biological effects. Single-cell RNA sequencing (scRNA-seq) is a powerful tool for quantifying cell-to-cell variation in transcript abundance and characterizing cellular dynamics. Although many scRNA-seq DE analysis methods accommodate known batch variables, their performance has not been systematically evaluated. Moreover, the challenge of accounting for latent batch variables in scRNA-seq DE analysis is largely unmet. In contrast, many methods have been developed to account for batch variables (either known or latent) in other high-dimensional data, especially bulk RNA-seq. We extensively evaluate eleven methods for handling batch variables in different scRNA-seq DE analysis scenarios, with a primary focus on latent batch variables. We demonstrate that, for known batch variables, incorporating them as covariates into a regression model outperforms approaches that use a batch-corrected matrix. For latent batches, fixed effects models have inflated false discovery rates (FDRs), whereas aggregation-based methods and mixed effects models suffer significant power loss. Surrogate-variable-based methods generally control the FDR well while achieving good power when group effects are small. However, their performance (except that of SVA) deteriorates substantially in scenarios involving large group effects and/or group label impurity. In these settings, SVA achieves relatively good performance despite an occasionally inflated FDR (up to 0.2).
Finally, we make the following recommendations for scRNA-seq DE analysis: 1) incorporate known batch variables as covariates rather than using batch-corrected data; 2) employ SVA for latent batch correction; and 3) develop better methods, which are still needed to fully unleash the power of scRNA-seq.
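
The first recommendation, modeling a known batch variable as a regression covariate, can be illustrated with a minimal sketch. It assumes a Gaussian linear model on simulated log-scale expression for a single gene, which is a simplification and not one of the UMI count models evaluated in the study. The function name `fit_group_effect`, the effect sizes, and the unbalanced group-by-batch design are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_group_effect(y, group, batch=None):
    """Least-squares estimate of the group coefficient.

    If `batch` is given, it is added to the design matrix as a covariate.
    """
    columns = [np.ones_like(y), group.astype(float)]
    if batch is not None:
        columns.append(batch.astype(float))
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]  # coefficient on the group indicator


rng = np.random.default_rng(0)
n_cells = 2000

# Hypothetical unbalanced design: batch 1 contains mostly group 1 cells,
# so batch and group are confounded.
batch = rng.integers(0, 2, size=n_cells)
group = (rng.random(n_cells) < np.where(batch == 1, 0.8, 0.2)).astype(int)

# Simulated log-scale expression for one gene (assumed effect sizes).
true_group_effect = 0.5
batch_effect = 1.0
noise = rng.normal(0.0, 1.0, size=n_cells)
y = 1.0 + true_group_effect * group + batch_effect * batch + noise

naive = fit_group_effect(y, group)            # batch ignored
adjusted = fit_group_effect(y, group, batch)  # batch as covariate

print(f"true group effect:       {true_group_effect:.2f}")
print(f"estimate without batch:  {naive:.2f}")
print(f"estimate with batch:     {adjusted:.2f}")
```

In this toy setting the estimate that ignores batch absorbs part of the batch effect, while the covariate-adjusted estimate stays close to the simulated group effect. The sketch shows only the covariate approach; it does not reproduce the comparison against batch-corrected matrices reported in the abstract.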
