Unbiased integration of single cell multi-omics data

Jinzhuang Dou, Shaoheng Liang, Vakul Mohanty, Xuesen Cheng, Sangbae Kim, Jongsu Choi, Yumei Li, Katayoun Rezvani, Rui Chen, Ken Chen
DOI: https://doi.org/10.1101/2020.12.11.422014
2020-12-11
Abstract: Acquiring accurate single-cell multiomics profiles often requires performing unbiased in silico integration of data matrices generated by different single-cell technologies from the same biological sample. However, both the rows and the columns can represent different entities in different data matrices, making such integration a computational challenge that has only been solved approximately by existing approaches. Here, we present bindSC, a single-cell data integration tool that realizes simultaneous alignment of the rows and the columns between data matrices without making approximations. Using datasets produced by multiomics technologies as gold standard, we show that bindSC generates multimodal co-embeddings that are substantially more accurate than those generated by existing approaches. Particularly, bindSC effectively integrated single-cell RNA sequencing (scRNA-seq) and single-cell chromatin accessibility sequencing (scATAC-seq) data towards discovering key regulatory elements in cancer cell lines and mouse cells. It achieved accurate integration of both common and rare cell types (<0.25% abundance) in a novel mouse retina cell atlas generated using the 10x Genomics Multiome ATAC+RNA kit. Further, it achieved unbiased integration of scRNA-seq and 10x Visium spatial transcriptomics data derived from mouse brain cortex samples. Lastly, it demonstrated efficacy in delineating immune cell types via integrating single-cell RNA and protein data. Thus, bindSC, available at https://github.com/KChen-lab/bindSC, can be applied in a broad variety of contexts to accelerate discovery of complex cellular and biological identities and associated molecular underpinnings in diseases and developing organisms.