Robust Differential Abundance Test in Compositional Data
Shulei Wang
DOI: https://doi.org/10.48550/arXiv.2101.08765
2022-04-13
Abstract: Differential abundance tests in compositional data are fundamental tasks in various biomedical applications, such as single-cell, bulk RNA-seq, and microbiome data analysis. However, because of the compositional constraint and the prevalence of zero counts in the data, differential abundance analysis in compositional data remains a complicated and unsolved statistical problem. This study introduces a new differential abundance test, the robust differential abundance (RDB) test, to address these challenges. Compared with existing methods, the RDB test is simple and computationally efficient, is robust to prevalent zero counts in compositional datasets, takes the data's compositional nature into account, and has a theoretical guarantee of controlling false discoveries in a general setting. Furthermore, in the presence of observed covariates, the RDB test can work with covariate balancing techniques to remove potential confounding effects and draw reliable conclusions. Finally, we apply the new test to several numerical examples using simulated and real datasets to demonstrate its practical merits.
Methodology, Statistics Theory, Quantitative Methods, Applications