ClusterMatch aligns single-cell RNA-sequencing data at the multi-scale cluster level via stable matching

Teer Ba,Hao Miao,Lirong Zhang,Caixia Gao,Yong Wang
DOI: https://doi.org/10.1093/bioinformatics/btae480
IF: 5.8
2024-07-29
Bioinformatics
Abstract: Motivation: Unsupervised clustering of single-cell RNA sequencing (scRNA-seq) data holds the promise of characterizing known and novel cell types in various biological and clinical contexts. However, intrinsic multi-scale clustering resolutions pose challenges in dealing with multiple sources of variability in high-dimensional, noisy data. Results: We present ClusterMatch, a stable matching optimization model that aligns scRNA-seq data at the cluster level. On the one hand, ClusterMatch leverages the mutual correspondence obtained by canonical correlation analysis (CCA) and multi-scale Louvain clustering algorithms to identify clusters with optimized resolutions. On the other hand, it uses a stable matching framework to align scRNA-seq data in the latent space while maintaining interpretability through overlapping marker gene sets. Through extensive experiments, we demonstrate the efficacy of ClusterMatch in data integration, cell type annotation, and cross-species/timepoint alignment scenarios. Our results show that ClusterMatch utilizes both global and local information in scRNA-seq data, sets an appropriate resolution for multi-scale clustering, and offers interpretability through marker genes. Availability: The code of the ClusterMatch software is freely available at https://github.com/AMSSwanglab/ClusterMatch. Supplementary information: Supplementary data are available at Bioinformatics online.
biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
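The stable matching idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic Gale-Shapley procedure applied to a hypothetical cluster-by-cluster similarity matrix, and the function name `stable_match` and the example scores are assumptions made for illustration only. How ClusterMatch actually scores cluster pairs is not described in this record.

```python
import numpy as np

def stable_match(sim):
    """Gale-Shapley matching on a similarity matrix (illustrative sketch).

    sim[a, b] is an assumed similarity between cluster a of dataset A
    and cluster b of dataset B. Clusters of A propose in order of
    decreasing similarity; each cluster of B keeps its best proposer.
    Returns a dict mapping matched A clusters to B clusters.
    """
    sim = np.asarray(sim, dtype=float)
    n_a, n_b = sim.shape
    # Preference list of every A cluster: B clusters, most similar first.
    prefs = np.argsort(-sim, axis=1, kind="stable")
    next_choice = [0] * n_a   # index of the next B cluster each A will try
    holder = {}               # B cluster -> A cluster it currently holds
    free = list(range(n_a))   # A clusters that still need a partner
    while free:
        a = free.pop()
        if next_choice[a] >= n_b:
            continue          # a has tried every B cluster; stays unmatched
        b = int(prefs[a, next_choice[a]])
        next_choice[a] += 1
        current = holder.get(b)
        if current is None:
            holder[b] = a                 # b was free, accept a
        elif sim[a, b] > sim[current, b]:
            holder[b] = a                 # b prefers a, release current
            free.append(current)
        else:
            free.append(a)                # b rejects a, a tries again later
    return dict(sorted((a, b) for b, a in holder.items()))

# Hypothetical similarity scores between 3 clusters in each dataset.
example_sim = [
    [0.9, 0.2, 0.1],
    [0.8, 0.7, 0.3],
    [0.1, 0.4, 0.6],
]
matches = stable_match(example_sim)
print(matches)
```

In the resulting matching no pair of clusters would both prefer each other over their assigned partners, which is the defining property of a stable matching. When the two datasets have different numbers of clusters, some clusters are simply left unmatched in this sketch.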
What problem does this paper attempt to address?