Deep single-cell RNA-seq data clustering with graph prototypical contrastive learning

Junseok Lee, Sungwon Kim, Dongmin Hyun, Namkyeong Lee, Yejin Kim, Chanyoung Park
DOI: https://doi.org/10.1093/bioinformatics/btad342
IF: 5.8
2023-06-01
Bioinformatics
Abstract: Motivation: Single-cell RNA sequencing (scRNA-seq) enables researchers to study cellular heterogeneity at the single-cell level. To this end, identifying cell types with clustering techniques has become an important task for downstream analysis. However, challenges of scRNA-seq data, such as pervasive dropout phenomena, hinder robust clustering. Although existing studies try to alleviate these problems, they fall short of fully leveraging the relational information and mainly rely on reconstruction-based losses that depend heavily on data quality, which is sometimes poor because the data are noisy. Results: This work proposes a graph-based prototypical contrastive learning method, named scGPCL. Specifically, scGPCL encodes the cell representations using Graph Neural Networks on a cell-gene graph that captures the relational information inherent in scRNA-seq data, and introduces prototypical contrastive learning to learn cell representations by pushing apart semantically dissimilar pairs and pulling together similar ones. Through extensive experiments on both simulated and real scRNA-seq data, we demonstrate the effectiveness and efficiency of scGPCL. Availability and implementation: Code is available at https://github.com/Junseok0207/scGPCL.
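
To make the "pulling together / pushing apart" idea concrete, the following is a minimal NumPy sketch of a generic prototypical contrastive loss. It is not the scGPCL implementation: the function name `prototypical_contrastive_loss`, the temperature value, and the toy two-dimensional data are all assumptions chosen for illustration, and the cell embeddings are assumed to be given rather than produced by a Graph Neural Network. Each cell embedding is compared against every prototype (cluster centre); the loss is low when a cell is closest to its own prototype and far from the others.

```python
import numpy as np

def prototypical_contrastive_loss(embeddings, prototypes, assignments, temperature=0.5):
    """Generic prototype-level contrastive (softmax) loss; illustrative only."""
    # L2-normalise so that dot products are cosine similarities.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    c = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    # Similarity of every cell to every prototype, scaled by the temperature.
    logits = z @ c.T / temperature                        # shape: (n_cells, n_prototypes)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Negative log-probability of each cell's assigned prototype, averaged over cells.
    return -log_probs[np.arange(len(z)), assignments].mean()

# Toy data (hypothetical): two prototypes and four cells scattered around them.
rng = np.random.default_rng(0)
centers = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.array([0, 0, 1, 1])
cells = centers[labels] + 0.05 * rng.normal(size=(4, 2))

loss_correct = prototypical_contrastive_loss(cells, centers, labels)
loss_wrong = prototypical_contrastive_loss(cells, centers, 1 - labels)
print(loss_correct < loss_wrong)
```

In this toy setting the loss is smaller when each cell is paired with the prototype it was generated from than when the assignments are swapped, which is the behaviour a contrastive objective of this kind is meant to reward.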