COMSE: analysis of single-cell RNA-seq data using community detection-based feature selection

Qinhuan Luo,Yaozhu Chen,Xun Lan
DOI: https://doi.org/10.1186/s12915-024-01963-5
2024-08-07
Abstract:Background: Single-cell RNA sequencing enables studying cells individually, yet high gene dimensions and low cell numbers challenge analysis. And only a subset of the genes detected are involved in the biological processes underlying cell-type specific functions. Result: In this study, we present COMSE, an unsupervised feature selection framework using community detection to capture informative genes from scRNA-seq data. COMSE identified homogenous cell substates with high resolution, as demonstrated by distinguishing different cell cycle stages. Evaluations based on real and simulated scRNA-seq datasets showed COMSE outperformed methods even with high dropout rates in cell clustering assignment. We also demonstrate that by identifying communities of genes associated with batch effects, COMSE parses signals reflecting biological difference from noise arising due to differences in sequencing protocols, thereby enabling integrated analysis of scRNA-seq datasets of different sources. Conclusions: COMSE provides an efficient unsupervised framework that selects highly informative genes in scRNA-seq data improving cell sub-states identification and cell clustering. It identifies gene subsets that reveal biological and technical heterogeneity, supporting applications like batch effect correction and pathway analysis. It also provides robust results for bulk RNA-seq data analysis.
What problem does this paper attempt to address?