Integrating single-cell datasets with ambiguous batch information by incorporating molecular network features
Ji Dong, Peijie Zhou, Yichong Wu, Yidong Chen, Haoling Xie, Yuan Gao, Jiansen Lu, Jingwei Yang, Xiannian Zhang, Lu Wen, Tiejun Li, Fuchou Tang
DOI: https://doi.org/10.1093/bib/bbab366
IF: 9.5
2021-09-21
Briefings in Bioinformatics
Abstract: With the rapid development of single-cell sequencing techniques, several large-scale cell atlas projects have been launched across the world. However, it is still challenging to integrate single-cell RNA-seq (scRNA-seq) datasets with diverse tissue sources, developmental stages and/or few overlaps, due to the ambiguity in determining the batch information, which is particularly important for current batch-effect correction methods. Here, we present SCORE, a simple network-based integration methodology, which incorporates curated molecular network features to infer cellular states and generate a unified workflow for integrating scRNA-seq datasets. Validating on real single-cell datasets, we showed that regardless of batch information, SCORE outperforms existing methods in accuracy, robustness, scalability and data integration.
biochemical research methods,mathematical & computational biology