Zero-Preserving Imputation of scRNA-seq Data Using Low-Rank Approximation
George C. Linderman, Jun Zhao, Yuval Kluger
DOI: https://doi.org/10.1101/397588
2018-01-01
Abstract: Single-cell RNA-sequencing (scRNA-seq) methods have revolutionized the study of gene expression but are plagued by dropout events, in which genes that are actually expressed in a given cell are incorrectly measured as unexpressed. We present a method based on low-rank approximation that replaces these dropouts (zero expression levels of unobserved expressed genes) with nonzero values, while preserving biologically non-expressed genes (true biological zeros) at zero expression levels. We validate our approach and compare it to two state-of-the-art methods. We show that it recovers the true expression of marker genes while preserving biological zeros, increases the separation of known cell types, and improves the correlation of simulated cells to their true profiles. Furthermore, our method is dramatically more scalable, allowing practitioners to quickly and easily recover expression of even the largest scRNA-seq datasets.