CLUEY enables knowledge-guided clustering and cell type detection from single-cell omics data

Daniel Kim, Carissa Chen, Lijia Yu, Jean Yee Hwa Yang, Pengyi Yang
DOI: https://doi.org/10.1101/2024.11.14.623697
2024-11-15
Abstract: Clustering is a fundamental task in single-cell omics data analysis and can significantly impact downstream analyses and biological interpretations. The standard approach involves grouping cells based on their gene expression profiles, followed by annotating each cluster to a cell type using marker genes. However, the number of cell types detected by different clustering methods can vary substantially due to several factors, including the dimension reduction method used and the parameter settings of the chosen clustering algorithm. These discrepancies can lead to subjective interpretations in downstream analyses, particularly in manual cell type annotation. To address these challenges, we propose CLUEY, a knowledge-guided framework for cell type detection and clustering of single-cell omics data. CLUEY integrates prior biological knowledge into the clustering process, providing guidance on the optimal number of clusters and enhancing the interpretability of results. We apply CLUEY to both unimodal (e.g. scRNA-seq, scATAC-seq) and multimodal datasets (e.g. CITE-seq, SHARE-seq) and demonstrate its effectiveness in providing biologically meaningful clustering outcomes. These results highlight CLUEY's ability to provide much-needed guidance in clustering analyses of single-cell omics data. The CLUEY package is available from https://github.com/SydneyBioX/CLUEY.
Bioinformatics