Distribution-Independent Cell Type Identification for Single-Cell RNA-seq Data

Yuyao Zhai,Liang Chen,Minghua Deng
DOI: https://doi.org/10.24963/ijcai.2024/679
2024-01-01
Abstract:Automatic cell type annotation aims to transfer the label knowledge from label-abundant reference data to label-scarce target data, which makes encouraging progress in single-cell RNA-seq data analysis. While previous works have focused on classifying close-set cells and detecting open-set cells during testing, it is still essential to be able to classify unknown cell types as human beings. Additionally, few efforts have been devoted to addressing the challenge of common long-tail dilemma in cell type annotation data. Therefore, in this paper, we propose an innovative distribution-independent universal cell type identification framework called scDET from the perspective of autonomously equilibrated dual-consultative contrastive learning. Our model can generate fine-grained predictions for both close-set and open-set cell types in a long-tailed open-world environment. scDET consists of a contrastive-learning branch and a pseudo-labeling branch, which work collaboratively to provide interactive supervision. Specifically, the contrastive-learning branch provides reliable distribution estimation to regularize the predictions of the pseudo-labeling branch, which in turn guides itself through self-balanced knowledge transfer and a designed novel soft contrastive loss. Extensive experimental results on various evaluation datasets demonstrate the superior performance of scDET over other state-of-the-art single-cell clustering and annotation methods.
What problem does this paper attempt to address?