A knockoff calibration method to avoid over-clustering in single-cell RNA-sequencing

Alan DenAdel, Michelle L. Ramseier, Andrew W. Navia, Alex K. Shalek, Srivatsan Raghavan, Peter S. Winter, Ava P. Amini, Lorin Crawford
DOI: https://doi.org/10.1101/2024.03.08.584180
2024-03-13
Abstract: Standard single-cell RNA-sequencing (scRNA-seq) pipelines nearly always include unsupervised clustering as a key step in identifying biologically distinct cell types. A follow-up step in these pipelines is to test for differential expression between the identified clusters. When algorithms over-cluster, downstream analyses will produce inflated p-values, resulting in increased false discoveries. In this work, we present callback (Calibrated Clustering via Knockoffs): a new method for protecting against over-clustering by controlling for the impact of reusing the same data twice when performing differential expression analysis, commonly known as “double-dipping”. Importantly, our approach can be applied to a wide range of clustering algorithms. Using real and simulated data, we show that callback provides state-of-the-art clustering performance and can rapidly analyze large-scale scRNA-seq studies, even on a personal laptop.
Bioinformatics