A Hybrid Clustering Algorithm for Identifying Cell Types from Single-Cell RNA-Seq Data.

Xiaoshu Zhu, Hong-Dong Li, Yunpei Xu, Lilu Guo, Fang-Xiang Wu, Guihua Duan, Jianxin Wang
DOI: https://doi.org/10.3390/genes10020098
IF: 4.141
2019-01-01
Genes
Abstract: Single-cell RNA sequencing (scRNA-seq) has recently brought new insight into cell differentiation processes and functional variation in cell subtypes from homogeneous cell populations. A lack of prior knowledge makes unsupervised machine learning methods, such as clustering, suitable for analyzing scRNA-seq data. However, there are several limitations to overcome, including high dimensionality, clustering result instability, and parameter adjustment complexity. In this study, we propose a method that combines structure entropy and k-nearest neighbors to identify cell subpopulations in scRNA-seq data. In contrast to existing clustering methods for identifying cell subtypes, minimizing structure entropy yields natural communities without specifying the number of clusters. To investigate the performance of our model, we applied it to eight scRNA-seq datasets and compared it with three existing methods (nonnegative matrix factorization, single-cell interpretation via multikernel learning, and the structural entropy minimization principle). The experimental results showed that our approach achieves, on average, better performance on these datasets than the benchmark methods.
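The abstract does not give the formulas, so the following is only a minimal sketch of the two ingredients it names, not the authors' implementation. It assumes an unweighted, symmetrized k-nearest-neighbor graph built from Euclidean distances, and it assumes the commonly used two-dimensional structural entropy of a graph partition as the quantity to be minimized. The function names `knn_graph` and `structure_entropy` and the toy data are hypothetical.

```python
import numpy as np


def knn_graph(X, k):
    """Symmetric, unweighted kNN adjacency matrix (assumption: Euclidean distance).

    X is a (cells x features) array; an edge i-j exists if j is among the
    k nearest neighbors of i or vice versa.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-neighbors
    nbrs = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)                         # symmetrize


def structure_entropy(A, labels):
    """Two-dimensional structural entropy of a partition (assumed definition).

    For each module j with volume V_j and cut size g_j:
      H = -sum_j sum_{i in j} (d_i / 2m) * log2(d_i / V_j)
          -sum_j (g_j / 2m) * log2(V_j / 2m)
    """
    labels = np.asarray(labels)
    deg = A.sum(axis=1)
    vol_graph = deg.sum()                             # equals 2m
    H = 0.0
    for c in np.unique(labels):
        mask = labels == c
        d = deg[mask]
        vol_c = d.sum()
        if vol_c == 0:
            continue                                  # isolated module adds nothing
        cut_c = A[mask][:, ~mask].sum()               # edges leaving the module
        d = d[d > 0]
        H -= np.sum(d / vol_graph * np.log2(d / vol_c))
        H -= cut_c / vol_graph * np.log2(vol_c / vol_graph)
    return H


# Toy usage: two well-separated groups of "cells" (synthetic, not real scRNA-seq data).
rng = np.random.default_rng(0)
cells = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(10, 1, (20, 5))])
graph = knn_graph(cells, k=5)
print(structure_entropy(graph, np.repeat([0, 1], 20)))   # candidate partition
print(structure_entropy(graph, np.zeros(40, dtype=int)))  # everything in one module
```

Under these assumptions, a search procedure would compare candidate partitions by this score and keep the one with the lowest value, which is how the number of clusters can emerge from the data rather than being set in advance.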