An Ultra-Scalable Ensemble Clustering Method for Cell Type Recognition Based on scRNA-seq Data of Alzheimer's Disease

Linfeng Hu, Juan Zhou, Yangping Qiu, Xiong Li
DOI: https://doi.org/10.1145/3544109.3544160
2022-01-01
Abstract: To address the problem of clustering single cells, and in view of the high dimensionality and sparsity of scRNA-seq data, we propose applying the ultra-scalable ensemble clustering (U-SENC) algorithm to single-cell clustering. The algorithm proceeds in two phases. In the first phase, a hybrid representative selection strategy is introduced, which retains the efficiency of random sample selection while preserving the quality of k-means-based selection of representatives. In the second phase, the K nearest representatives of each data object in the dataset are efficiently approximated by a coarse-to-fine method, and a sparse affinity submatrix is constructed between the objects and the representatives. The affinity submatrix is then interpreted as a bipartite graph, which is partitioned by transfer cut to obtain the clustering result. Finally, U-SENC integrates multiple ultra-scalable spectral clustering (U-SPEC) clusterers into an ensemble, improving on the robustness of a single U-SPEC run while remaining highly effective. The experimental results show that the proposed clustering algorithm identifies cell types better, as measured by the clustering evaluation index normalized mutual information (NMI).
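The NMI index named in the abstract can be illustrated with a short sketch. The abstract does not say which normalization the authors use, so the code below assumes the geometric-mean form, NMI = I(U;V) / sqrt(H(U) H(V)); the function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same objects."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(true_labels, predicted_labels):
    """NMI with geometric-mean normalization (an assumption, see lead-in)."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("both labelings must cover the same objects")
    if len(true_labels) == 0:
        raise ValueError("labelings must not be empty")
    h_true = entropy(true_labels)
    h_pred = entropy(predicted_labels)
    if h_true == 0.0 or h_pred == 0.0:
        # Convention chosen here: two single-cluster labelings agree fully,
        # otherwise a single-cluster labeling shares no information.
        return 1.0 if h_true == h_pred else 0.0
    return mutual_information(true_labels, predicted_labels) / sqrt(h_true * h_pred)


# Toy example with made-up cell-type annotations and cluster ids.
cell_types = ["astrocyte", "astrocyte", "neuron", "neuron", "microglia", "microglia"]
clusters = [0, 0, 1, 1, 2, 2]
score = nmi(cell_types, clusters)
```

NMI depends only on how the objects are grouped, not on the cluster names, so a clustering that reproduces the annotated cell types under different ids still scores as a perfect match, while a clustering that is statistically independent of the annotation scores zero.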