Single-cell RNA-seq clustering: datasets, models, and algorithms

Lihong Peng, Xiongfei Tian, Geng Tian, Junlin Xu, Xin Huang, Yanbin Weng, Jialiang Yang, Liqian Zhou
DOI: https://doi.org/10.1080/15476286.2020.1728961
2020-03-01
RNA Biology
Abstract: Single-cell RNA sequencing (scRNA-seq) technologies offer numerous opportunities for novel and potentially unexpected biological discoveries. scRNA-seq clustering helps elucidate cell-to-cell heterogeneity and uncover cell subgroups and cell dynamics at the group level. This review introduces and discusses two important aspects of scRNA-seq data analysis: relevant datasets and analytical tools. In particular, we review popular scRNA-seq datasets and discuss scRNA-seq clustering models, including K-means clustering, hierarchical clustering, and consensus clustering. Seven state-of-the-art scRNA-seq clustering methods were compared on five publicly available datasets. Two primary evaluation metrics, the Adjusted Rand Index (ARI) and the Normalized Mutual Information (NMI), were used to evaluate these methods. Although unsupervised models can effectively cluster scRNA-seq data, they still face challenges. We close with suggestions for future research directions.
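To make the two evaluation metrics named in the abstract concrete, the following is a minimal sketch of how ARI and NMI can be computed from a reference labeling and a predicted clustering. It is not the authors' evaluation code. The function names and the toy labels are illustrative assumptions, and the NMI here is normalized by the arithmetic mean of the two label entropies, which is one of several common conventions.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand Index computed from the contingency table of two labelings."""
    n = len(true_labels)
    if n < 2:
        return 1.0  # assumption: treat a trivial input as perfect agreement
    cells = Counter(zip(true_labels, pred_labels))  # contingency counts n_ij
    rows = Counter(true_labels)                     # row sums a_i
    cols = Counter(pred_labels)                     # column sums b_j
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)     # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # both labelings are trivial (for example, a single cluster each)
    return (sum_cells - expected) / (max_index - expected)


def normalized_mutual_information(true_labels, pred_labels):
    """Mutual information divided by the arithmetic mean of the label entropies."""
    n = len(true_labels)
    cells = Counter(zip(true_labels, pred_labels))
    rows = Counter(true_labels)
    cols = Counter(pred_labels)
    mutual_info = sum(
        (c / n) * log(n * c / (rows[t] * cols[p])) for (t, p), c in cells.items()
    )
    h_true = -sum((c / n) * log(c / n) for c in rows.values())
    h_pred = -sum((c / n) * log(c / n) for c in cols.values())
    mean_entropy = (h_true + h_pred) / 2
    if mean_entropy == 0:
        return 1.0  # both labelings place every cell in one cluster
    return mutual_info / mean_entropy


if __name__ == "__main__":
    # Hypothetical toy labels for six cells; the cluster ids are arbitrary.
    reference = [0, 0, 1, 1, 2, 2]
    predicted = [1, 1, 0, 0, 2, 2]
    print(adjusted_rand_index(reference, predicted))
    print(normalized_mutual_information(reference, predicted))
```

Both metrics depend only on how cells are grouped, not on the cluster ids themselves, so a clustering method is not penalized for numbering its clusters differently from the reference annotation.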
biochemistry & molecular biology