Reusability report: Leveraging supervised learning to uncover phenotype-relevant biology from single-cell RNA sequencing data

Yingying Cao, Tian-Gen Chang, Sahil Sahni, Eytan Ruppin
DOI: https://doi.org/10.1038/s42256-024-00804-y
IF: 23.8
2024-03-06
Nature Machine Intelligence
Abstract: Recent advances in single-cell transcriptome sequencing and computational analysis methods have improved our understanding of cellular heterogeneity. However, associating different cell subsets with phenotypes remains challenging. Recently, Ren et al. introduced PENCIL, a supervised learning framework incorporating gene selection to discern phenotype-relevant cells. To assess PENCIL's reproducibility and transferability, we conducted a comprehensive evaluation across 12 single-cell RNA sequencing datasets representing four distinct phenotypes. We identified a few caveats in the original version of PENCIL, such as sensitivity to input perturbation; correcting them improved PENCIL's reproducibility. We highlight that combining PENCIL's cell subset identification with gene set variation analysis yields a cytotoxic T cell immunotherapy response signature (CyTIR) that is predictive of immune checkpoint blockade response in skin cancer across multiple datasets, with an area under the curve >0.75 and accuracy >0.71. Overall, our assessments enhance PENCIL's reproducibility and utility, further extending its potential for identifying phenotype-relevant cell subsets in diverse biomedical applications.
computer science, artificial intelligence, interdisciplinary applications