SCOTCluster: Deep Clustering with Optimal Transport for Large-scale Single-cell RNA-seq Data

Faning Long, Xiaojun Ding, Xiaoqing Peng, Jianxin Wang, Xiaoshu Zhu
DOI: https://doi.org/10.1109/bibm52615.2021.9669800
2021-01-01
Abstract: Single-cell RNA sequencing (scRNA-seq) reveals cell heterogeneity at high resolution, which makes it possible to explore cell development. The high dimensionality and high noise of scRNA-seq data pose computational challenges. By introducing optimal transport regularization, we proposed a novel deep clustering method, called SCOTCluster, which accurately learned low-dimensional representations. In SCOTCluster, a joint training strategy was designed by integrating an autoencoder with soft k-means. Notably, to improve accuracy and robustness simultaneously, optimal transport was introduced into the objective function of soft k-means, and entropy regularization with the Sinkhorn iterative algorithm was used to constrain the cluster sizes. To test its performance, we compared SCOTCluster with five state-of-the-art methods on 16 real large-scale scRNA-seq datasets. The experimental results showed that SCOTCluster improved training stability and clustering performance.
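The abstract does not give the objective function, so the following is only a minimal sketch of the general idea it names: entropy-regularized optimal transport, solved with Sinkhorn iterations, used to turn cell-to-centroid distances into soft assignments with constrained cluster sizes. It is not the authors' implementation. The function names, the uniform cluster-size target, the squared Euclidean cost, and the values of `epsilon` and `n_iters` are all assumptions made for illustration.

```python
import numpy as np


def squared_distances(z, centers):
    """Squared Euclidean distance between each embedding and each centroid."""
    diff = z[:, None, :] - centers[None, :, :]
    return np.sum(diff ** 2, axis=2)


def sinkhorn_assign(cost, epsilon=0.5, n_iters=200):
    """Soft assignments from entropy-regularized optimal transport.

    cost    : (n_cells, n_clusters) cost matrix
    epsilon : entropy regularization strength (assumed value)
    Returns a transport plan P whose rows sum to 1/n_cells and whose
    columns sum to 1/n_clusters (uniform cluster sizes are an assumption).
    """
    n, k = cost.shape
    r = np.full(n, 1.0 / n)  # every cell carries equal mass
    c = np.full(k, 1.0 / k)  # every cluster receives equal mass

    # Gibbs kernel; subtracting the row minimum only rescales rows,
    # which the scaling vector u absorbs, and improves numerical stability.
    K = np.exp(-(cost - cost.min(axis=1, keepdims=True)) / epsilon)

    v = np.ones(k)
    for _ in range(n_iters):
        u = r / (K @ v)    # rescale rows to match r
        v = c / (K.T @ u)  # rescale columns to match c
    return u[:, None] * K * v[None, :]


# Hypothetical usage: random 2-D "embeddings" stand in for autoencoder output.
rng = np.random.default_rng(0)
z = rng.random((60, 2))
centers = rng.random((3, 2))
P = sinkhorn_assign(squared_distances(z, centers))
labels = P.argmax(axis=1)  # hard labels, if needed
```

In a joint training setup such as the one the abstract describes, a plan like `P` would plausibly replace the usual softmax assignments of soft k-means, so that no cluster can collapse to zero mass; how SCOTCluster actually combines this with the autoencoder loss is not stated here.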