SCIntRuler: guiding the integration of multiple single-cell RNA-seq datasets with a novel statistical metric

Yue Lyu, Steven H Lin, Hao Wu, Ziyi Li
DOI: https://doi.org/10.1093/bioinformatics/btae537
IF: 5.8
2024-09-02
Bioinformatics
Abstract: Motivation: The growing number of single-cell RNA-seq (scRNA-seq) studies highlights the potential benefits of integrating multiple datasets, such as augmenting sample sizes and enhancing analytical robustness. However, inherent diversity and batch discrepancies within samples or across studies continue to pose significant challenges for computational analyses. Two practical questions remain without definitive answers: Should we use a specific integration method, or simply merge the datasets for joint analysis? And among the existing data integration methods, which one is most suitable in a given scenario? Results: To fill this gap, we introduce SCIntRuler, a novel statistical metric for guiding the integration of multiple scRNA-seq datasets. SCIntRuler helps researchers make informed decisions about whether data integration is necessary and which integration method to select. Our simulations and real data applications demonstrate that SCIntRuler streamlines decision-making and facilitates the analysis of diverse scRNA-seq datasets under varying contexts, thereby alleviating the complexities of integrating heterogeneous scRNA-seq datasets. Availability and implementation: The implementation of our method is available on CRAN as an open-source R package with a user-friendly manual: https://cloud.r-project.org/web/packages/SCIntRuler/index.html.