Single-Cell Clustering Based on Shared Nearest Neighbor and Graph Partitioning

Xiaoshu Zhu, Jie Zhang, Yunpei Xu, Jianxin Wang, Xiaoqing Peng, Hong-Dong Li
DOI: https://doi.org/10.1007/s12539-019-00357-4
2020-01-01
Interdisciplinary Sciences Computational Life Sciences
Abstract: Clustering of single-cell RNA sequencing (scRNA-seq) data enables the discovery of cell subtypes, which helps in understanding and analyzing disease processes. Determining edge weights is an essential component of graph-based clustering methods. Although several graph-based clustering algorithms for scRNA-seq data have been proposed, they are generally based on k-nearest neighbor (KNN) and shared nearest neighbor (SNN) graphs and do not consider the structural information of the graph. Here, to improve clustering accuracy, we present a novel method for single-cell clustering, called structural shared nearest neighbor-Louvain (SSNN-Louvain), which integrates the structural information of the graph with module detection. In SSNN-Louvain, the edge weight is defined from the distance between a node and its shared nearest neighbors, together with the ratio of the number of shared nearest neighbors to the number of nearest neighbors, thereby incorporating the structural information of the graph. A modified Louvain community detection algorithm is then proposed and applied to identify modules in the graph; each community essentially represents a cell subtype. Notably, the proposed method combines the advantages of the SNN graph and community detection without tuning any parameter other than the number of neighbors. To test the performance of SSNN-Louvain, we compare it with five existing methods (nonnegative matrix factorization, single-cell interpretation via multi-kernel learning, SNN-Cliq, Seurat and PhenoGraph) on 16 real datasets. The experimental results show that our approach achieves the best average performance across these datasets.
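To make the idea of a shared-nearest-neighbor edge weight concrete, the following is a minimal sketch in plain Python. It is not the SSNN-Louvain weight defined in the paper: the abstract does not give the exact formula, so the weight used here (number of shared nearest neighbors divided by k) is an illustrative assumption that captures only the ratio term and omits the distance term. The function names `knn_indices` and `snn_ratio_edges` and the toy coordinates are hypothetical.

```python
import math


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest
    neighbors under Euclidean distance (the point itself is excluded)."""
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def snn_ratio_edges(points, k):
    """Build an undirected weighted graph as a dict {(i, j): weight}, i < j.

    An edge is created when i and j are neighbors in at least one direction
    and share at least one nearest neighbor. The weight is an assumed
    simplification: |shared neighbors| / k.
    """
    nn = knn_indices(points, k)
    edges = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if j in nn[i] or i in nn[j]:
                shared = nn[i] & nn[j]
                if shared:
                    edges[(i, j)] = len(shared) / k
    return edges


# Toy data: two well-separated groups of four "cells" in 2-D.
cells = [(0, 0), (0, 1), (1, 0), (1, 1),
         (10, 10), (10, 11), (11, 10), (11, 11)]
edges = snn_ratio_edges(cells, k=3)
```

On this toy input every edge stays within one of the two groups, so a community detection step applied to the weighted graph would have a clear structure to recover. In practice the `edges` dictionary could be handed to any Louvain implementation; the modified Louvain procedure described in the abstract is not reproduced here.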