Explainable deep neural networks for predicting sample phenotypes from single-cell transcriptomics
Jordi Martorell-Marugán, Raúl López-Domínguez, Juan Antonio Villatoro-García, Daniel Toro-Domínguez, Marco Chierici, Giuseppe Jurman, Pedro Carmona-Sáez
DOI: https://doi.org/10.1101/2024.12.03.626549
2024-12-06
Abstract: Recent advances in single-cell RNA-Seq (scRNA-Seq) technologies have revolutionized our ability to gather molecular insights into different phenotypes, such as diseases, at the level of individual cells. The resulting data are sparse and voluminous, which poses significant analytical challenges and calls for proper statistical methods to extract information from scRNA-Seq datasets. Sample classification based on gene expression data has proven effective and valuable for precision medicine applications. However, standard classification schemes are often not suitable for scRNA-Seq data because of their unique characteristics, and new algorithms are required to effectively analyze and classify samples at the single-cell level. In this article, we introduce singleDeep, an end-to-end pipeline that streamlines the analysis of scRNA-Seq data by training deep neural networks, enabling robust prediction and characterization of sample phenotypes. To validate singleDeep, we applied it to scRNA-Seq datasets from different conditions, including systemic lupus erythematosus and Alzheimer's disease. Our results demonstrate strong diagnostic performance, validated both internally and externally. Moreover, singleDeep consistently outperformed traditional machine learning methods applied to pseudobulk data. In addition to accurate predictions, singleDeep provides insights into the importance of cell types and genes for phenotypic characterization, which yielded additional valuable information in our use cases. For instance, we corroborated that some interferon signature genes are consistently relevant for autoimmunity across all immune cell types in lupus. In contrast, we found that genes linked to dementia have relevant roles in specific brain cell populations, such as APOE in astrocytes.
Biology