scMerge: Integration of multiple single-cell transcriptomics datasets leveraging stable expression and pseudo-replication

Yingxin Lin, Shila Ghazanfar, Kevin Wang, Johann A. Gagnon-Bartsch, Kitty K. Lo, Xianbin Su, Ze-Guang Han, John T. Ormerod, Terence P. Speed, Pengyi Yang, Jean Yee Hwa Yang
DOI: https://doi.org/10.1101/393280
IF: 11.1
2018-01-01
Proceedings of the National Academy of Sciences
Abstract: Concerted examination of multiple collections of single cell RNA-Seq (scRNA-Seq) data promises further biological insights that cannot be uncovered with individual datasets. However, such integrative analyses are challenging and require sophisticated methodologies. To enable effective interrogation of multiple scRNA-Seq datasets, we have developed a novel algorithm, named scMerge, that removes unwanted variation by combining stably expressed genes and utilizing pseudo-replicates across datasets. Analysis of large collections of publicly available datasets demonstrates that scMerge performs well in multiple scenarios and enhances biological discovery, including inferring cell developmental trajectories.