scCrab: A Reference-Guided Cancer Cell Identification Method based on Bayesian Neural Networks

Heyang Hua, Wenxin Long, Yan Pan, Siyu Li, Jianyu Zhou, Haixin Wang, Shengquan Chen
DOI: https://doi.org/10.1007/s12539-024-00655-6
2024-10-01
Interdisciplinary Sciences: Computational Life Sciences
Abstract: Cancer is a significant global public health concern, and early detection can greatly improve curative outcomes. Identifying cancer cells is therefore of great importance as the primary method for cancer diagnosis. Advances in single-cell RNA sequencing (scRNA-seq) technology have made it possible to identify cancer cells at the single-cell level more efficiently with computational methods than with manual identification, which is time-consuming and less reproducible. However, existing computational methods show suboptimal identification performance and lack the capability to incorporate external reference data as prior information. Here, we propose scCrab, a reference-guided automatic cancer cell identification method that performs ensemble learning based on a Bayesian neural network (BNN) with multi-head self-attention mechanisms and a linear regression model. Through a series of experiments on various datasets, we systematically validated the superior performance of scCrab in both intra- and inter-dataset predictions. We also demonstrated the robustness of scCrab to dropout rate and sample size, and conducted ablation experiments to investigate the contribution of each component of scCrab. Furthermore, as a dedicated model for cancer cell identification, scCrab effectively captures biologically significant, cancer-related information during the identification process.
mathematical & computational biology