scDAC: deep adaptive clustering of single-cell transcriptomic data with coupled autoencoder and Dirichlet process mixture model

Sijing An,Jinhui Shi,Runyan Liu,Yaowen Chen,Jing Wang,Shuofeng Hu,Xinyu Xia,Guohua Dong,Xiaochen Bo,Zhen He,Xiaomin Ying
DOI: https://doi.org/10.1093/bioinformatics/btae198
IF: 5.8
2024-03-29
Bioinformatics
Abstract:Abstract Motivation Clustering analysis for single-cell RNA sequencing (scRNA-seq) data is an important step in revealing cellular heterogeneity. Many clustering methods have been proposed to discover heterogenous cell types from scRNA-seq data. However, adaptive clustering with accurate cluster number reflecting intrinsic biology nature from large-scale scRNA-seq data remains quite challenging. Results Here, we propose a single-cell Deep Adaptive Clustering (scDAC) model by coupling the Autoencoder (AE) and the Dirichlet Process Mixture Model (DPMM). By jointly optimizing the model parameters of AE and DPMM, scDAC achieves adaptive clustering with accurate cluster numbers on scRNA-seq data. We verify the performance of scDAC on five subsampled datasets with different numbers of cell types and compare it with 15 widely used clustering methods across nine scRNA-seq datasets. Our results demonstrate that scDAC can adaptively find accurate numbers of cell types or subtypes and outperforms other methods. Moreover, the performance of scDAC is robust to hyperparameter changes. Availability and implementation The scDAC is implemented in Python. The source code is available at https://github.com/labomics/scDAC.
biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
What problem does this paper attempt to address?