GLMsim: a GLM-based single cell RNA-seq simulator incorporating batch and biological effects

Jianan Wang, Lizhong Chen, Rachel Thijssen, Belinda Phipson, Terence P. Speed
DOI: https://doi.org/10.1101/2024.03.20.586030
2024-03-23
Abstract: With the development of single cell RNA-seq technologies, large numbers of cells can now be routinely sequenced on different platforms. Merging those cells requires an efficient integration tool, along with computational simulators to benchmark and assess the performance of such tools. Although existing single cell RNA-seq simulators can simulate library size, biological effects, and batch effects separately, they do not currently capture the associations among these three factors. Here we present GLMsim, the first single cell RNA-seq simulator to simultaneously capture library size, biology, and unwanted variation, together with their associations, via a generalized linear model, and to simulate data resembling the original experimental data in these respects. GLMsim can quantitatively benchmark different single cell integration methods and assess their ability to retain biology while removing library size and batch effects.
Bioinformatics