SELINA: Single-Cell Assignment Using Multiple-Adversarial Domain Adaptation Network with Large-Scale References

Pengfei Ren, Xiaoying Shi, Zhiguang Yu, Xin Dong, Xuanxin Ding, Jin Wang, Liangdong Sun, Yilv Yan, Junjie Hu, Peng Zhang, Qianming Chen, Jing Zhang, Taiwen Li, Chenfei Wang
DOI: https://doi.org/10.1016/j.crmeth.2023.100577
2022-01-01
SSRN Electronic Journal
Abstract: The rapid accumulation of single-cell RNA-seq data has provided rich resources to characterize various human cell types. Cell type annotation is a critical step in analyzing single-cell RNA-seq data. However, accurate cell type annotation based on public references is challenging due to inconsistent annotations, batch effects, and poor characterization of rare cell types. Here, we introduce SELINA (single cELl identity NAvigator), an integrative annotation transferring framework for automatic cell type annotation. SELINA optimizes the annotation for minority cell types by synthetic minority over-sampling, removes batch effects among reference datasets using a multiple-adversarial domain adaptation network (MADA), and fits the query data with reference data using an autoencoder. Finally, SELINA affords a comprehensive and uniform reference atlas with 1.7 million cells covering 230 major human cell types. We demonstrated the robustness and superiority of SELINA in most human tissues compared to existing methods. SELINA provides a one-stop solution for human single-cell RNA-seq data annotation with the potential to extend to other species.
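To make the first step of the abstract concrete, the following is a minimal sketch of the general idea behind synthetic minority over-sampling: new samples for a rare cell type are created by interpolating between an existing cell and one of its nearest neighbours of the same type. This is not SELINA's own implementation; the function name `smote_oversample`, its parameters, and the use of plain Euclidean distance on an expression matrix are assumptions made for illustration only.

```python
import numpy as np


def smote_oversample(X_minority, n_new, k=5, seed=0):
    """Hypothetical helper: create n_new synthetic rows by interpolating
    between minority-class samples and their k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X_minority, dtype=float)
    n = X.shape[0]
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Pairwise squared Euclidean distances within the minority class.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbour
    neighbours = np.argsort(d2, axis=1)[:, :k]

    # Pick a base sample, one of its neighbours, and an interpolation weight.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X[base] + gap * (X[picked] - X[base])


# Usage: 10 cells of a rare type with 4 features, expanded by 30 synthetic cells.
rare_cells = np.random.default_rng(1).normal(size=(10, 4))
synthetic = smote_oversample(rare_cells, n_new=30)
```

Because every synthetic row is a convex combination of two real cells, it stays within the range spanned by the observed minority samples, which is why this kind of over-sampling can balance class sizes without inventing values outside the observed data.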