scDesign2: a transparent simulator that generates high-fidelity single-cell gene expression count data with gene correlations captured

Tianyi Sun, Dongyuan Song, Wei Vivian Li, Jingyi Jessica Li
DOI: https://doi.org/10.1186/s13059-021-02367-2
IF: 17.906
2021-05-25
Genome Biology
Abstract: A pressing challenge in single-cell transcriptomics is to benchmark experimental protocols and computational methods. A solution is to use computational simulators, but existing simulators cannot simultaneously achieve three goals: preserving genes, capturing gene correlations, and generating any number of cells with varying sequencing depths. To fill this gap, we propose scDesign2, a transparent simulator that achieves all three goals and generates high-fidelity synthetic data for multiple single-cell gene expression count-based technologies. In particular, scDesign2 is advantageous in its transparent use of probabilistic models and its ability to capture gene correlations via copulas.
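To illustrate what "capturing gene correlations via copulas" can mean for count data, here is a minimal sketch. It is not scDesign2's implementation (scDesign2 is distributed as an R package, and the abstract does not specify these details). The sketch assumes a Gaussian copula over two genes with negative binomial marginals. The function names (`nb_pmf`, `nb_quantile`, `simulate_two_genes`) and all parameter values are hypothetical, chosen only for illustration.

```python
import math
import random
from statistics import NormalDist


def nb_pmf(k, mean, size):
    """Negative binomial PMF, parameterized by mean and size (assumed form)."""
    p = size / (size + mean)  # success probability implied by mean and size
    log_pmf = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(p) + k * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def nb_quantile(u, mean, size, max_count=10000):
    """Smallest count k whose cumulative probability reaches u."""
    cdf = 0.0
    for k in range(max_count):
        cdf += nb_pmf(k, mean, size)
        if cdf >= u:
            return k
    return max_count  # guard against u numerically equal to 1


def simulate_two_genes(n_cells, rho, means, sizes, seed=0):
    """Draw correlated counts for two genes via a Gaussian copula (sketch)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    counts = []
    for _ in range(n_cells):
        # Step 1: correlated standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Step 2: map each normal to a uniform via the normal CDF.
        uniforms = (std_normal.cdf(z1), std_normal.cdf(z2))
        # Step 3: push each uniform through that gene's marginal quantile.
        counts.append(tuple(
            nb_quantile(u, m, s) for u, m, s in zip(uniforms, means, sizes)
        ))
    return counts


# Illustrative values only: 500 cells, latent correlation 0.8.
cells = simulate_two_genes(500, 0.8, means=(5.0, 10.0), sizes=(2.0, 3.0))
```

The design point is the separation of concerns: each gene keeps its own count distribution, while the dependence between genes lives entirely in the latent normal correlation. Because the marginal parameters are explicit, the number of cells can be changed freely and the means rescaled to mimic a different sequencing depth.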
genetics & heredity, biotechnology & applied microbiology
What problem does this paper attempt to address?