Identification of cell-type-specific genes in multimodal single-cell data using deep neural network algorithm

Weiye Qian, Zhiyuan Yang
DOI: https://doi.org/10.1016/j.compbiomed.2023.107498
Abstract: The emergence of single-cell RNA sequencing (scRNA-seq) technology makes it possible to measure DNA, RNA, and protein in a single cell. Cellular Indexing of Transcriptomes and Epitopes by sequencing (CITE-seq) is a powerful multimodal single-cell method that allows researchers to capture RNA and surface protein expression in the same cells. Identifying cell-type-specific genes in CITE-seq data remains challenging. In this study, we obtained a set of CITE-seq datasets from the Kaggle database, covering seven cell types that arise during bone marrow stem cell differentiation. We used Student's t-test to analyze the RNA transcripts and identified 133 significantly differentially expressed genes (DEGs) among all cell types. Functional enrichment analysis revealed that these DEGs were strongly associated with blood-related diseases, providing important insights into the cellular heterogeneity within bone marrow stem cells. The relation between RNA and protein levels was modeled with a deep neural network (DNN), which achieved a high prediction score of 0.867. Based on their coefficients in the DNN model, three genes (LGALS1, CENPV, TRIM24) were identified as cell-type-specific genes in erythrocyte progenitors. Our work provides a novel perspective on the differentiation of stem cells in the bone marrow and valuable insights for further research in this field.
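The DEG screening step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the seven cell types were compared, so the one-versus-rest grouping below is an assumption, as are the function names (`student_t`, `rank_genes`), the gene names, and the toy expression values. The sketch computes only the pooled-variance Student's t statistic and ranks genes by its magnitude; it does not compute p-values or correct for multiple testing.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic with pooled (equal) variance.

    Raises ZeroDivisionError if both groups have zero variance.
    """
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def rank_genes(expr, labels, target):
    """Rank genes by |t| for cells of type `target` versus all other cells.

    The one-vs-rest comparison is an assumption made for illustration.
    """
    scores = {}
    for gene, values in expr.items():
        inside = [v for v, lab in zip(values, labels) if lab == target]
        outside = [v for v, lab in zip(values, labels) if lab != target]
        scores[gene] = student_t(inside, outside)
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical toy data: six cells, two cell types, two genes.
labels = ["A", "A", "A", "B", "B", "B"]
expr = {
    "gene_x": [5.0, 6.0, 7.0, 1.0, 2.0, 3.0],  # higher in type A
    "gene_y": [2.0, 3.0, 4.0, 2.0, 3.0, 4.0],  # same in both types
}

ranked = rank_genes(expr, labels, target="A")
print(ranked)
```

In practice the statistic would be converted to a p-value against a t distribution with `na + nb - 2` degrees of freedom and adjusted for the number of genes tested.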