TPK: a single-cell clustering algorithm based on novel feature selection genes

Yaxuan Cui, Kunjie Luo, Zheyu Zhang, Saijia Liu
DOI: https://doi.org/10.1088/1742-6596/1738/1/012078
2021-01-01
Journal of Physics: Conference Series
Abstract: With the continuous development of single-cell sequencing technology, the gene expression data it produces allow a deeper understanding of the heterogeneity between cells and of the underlying mechanisms at work between them. However, the complexity of these data makes single-cell identification and clustering a major challenge. We found that many classic clustering algorithms perform poorly on single-cell data, and our research indicates that the key reason is that no marker genes are identified beforehand. TPK therefore selects genes before clustering: first, genes with low expression levels are removed; then the variance of each remaining gene is calculated and the 1000 genes with the largest variance are selected; a t-test is then performed to remove noise. Finally, the cells are clustered on the selected genes using cosine similarity and k-means. We found that this approach gives good clustering performance.
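The four steps named in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `select_genes` and `cosine_kmeans`, the thresholds `min_mean` and `t_min`, and the random initialisation are all hypothetical. In particular, the abstract does not say what the t-test compares, so the sketch substitutes a one-sample t statistic against zero as a placeholder. Cosine similarity is combined with k-means here by scaling each cell to unit length and assigning it to the most similar center, which is one common way to do this and may differ from the paper. Only the standard library is used to keep the example self-contained.

```python
import math
import random
import statistics


def select_genes(expr, min_mean=0.1, n_top=1000, t_min=2.0):
    """Return indices of genes kept by the three filtering steps.

    expr is a list of cells, each a list of expression values (one per gene).
    min_mean and t_min are assumed thresholds, not values from the paper.
    """
    n_cells = len(expr)
    columns = list(zip(*expr))  # one tuple of values per gene

    # Step 1: remove genes with low expression (mean below min_mean).
    kept = [g for g, col in enumerate(columns)
            if statistics.fmean(col) >= min_mean]

    # Step 2: keep the n_top genes with the largest variance.
    kept.sort(key=lambda g: statistics.pvariance(columns[g]), reverse=True)
    kept = kept[:n_top]

    # Step 3: placeholder t-test filter (one-sample t statistic against zero).
    def t_stat(col):
        sd = statistics.stdev(col)
        if sd == 0:
            return 0.0  # a constant gene carries no signal; drop it
        return statistics.fmean(col) / (sd / math.sqrt(n_cells))

    return sorted(g for g in kept if abs(t_stat(columns[g])) >= t_min)


def unit(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def cosine_kmeans(cells, k, n_iter=50, seed=0):
    """k-means on unit-length cells, assigning by highest cosine similarity."""
    rng = random.Random(seed)
    data = [unit(c) for c in cells]
    centers = rng.sample(data, k)  # assumed initialisation: k random cells
    labels = None
    for _ in range(n_iter):
        # On unit vectors the dot product equals the cosine similarity.
        new_labels = [
            max(range(k),
                key=lambda j: sum(a * b for a, b in zip(x, centers[j])))
            for x in data
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centers[j] = unit(mean)
    return labels
```

To use it, pass the expression matrix to `select_genes`, restrict each cell to the returned gene indices, and hand the reduced cells to `cosine_kmeans` together with the desired number of clusters. A real analysis would more likely rely on array and machine-learning libraries for speed.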