Beyond benchmarking and towards predictive models of dataset-specific single-cell RNA-seq pipeline performance

Cindy Fang,Alina Selega,Kieran R. Campbell
DOI: https://doi.org/10.1186/s13059-024-03304-9
IF: 17.906
2024-06-20
Genome Biology
Abstract:The advent of single-cell RNA-sequencing (scRNA-seq) has driven significant computational methods development for all steps in the scRNA-seq data analysis pipeline, including filtering, normalization, and clustering. The large number of methods and their resulting parameter combinations has created a combinatorial set of possible pipelines to analyze scRNA-seq data, which leads to the obvious question: which is best? Several benchmarking studies compare methods but frequently find variable performance depending on dataset and pipeline characteristics. Alternatively, the large number of scRNA-seq datasets along with advances in supervised machine learning raise a tantalizing possibility: could the optimal pipeline be predicted for a given dataset?
biotechnology & applied microbiology,genetics & heredity
What problem does this paper attempt to address?