Using Multi-Encoder Semi-Implicit Graph Variational Autoencoder to Analyze Single-Cell RNA Sequencing Data

Shengwen Tian, Cunmei Ji, Jiancheng Ni, Yutian Wang, Chunhou Zheng
DOI: https://doi.org/10.1109/TCBB.2024.3458170
2024-09-10
Abstract: Rapid advances in single-cell RNA sequencing (scRNA-seq) have made it possible to characterize cell states at high resolution across large-scale libraries. scRNA-seq data contain a great deal of biological information, which is mainly used to discover cell subtypes and track cell development. However, traditional methods struggle with the high dimensionality and high sparsity of scRNA-seq data. To analyze scRNA-seq data more effectively, we propose a new framework called MSVGAE, based on the variational graph auto-encoder and graph attention networks. Specifically, we introduce multiple encoders to learn features at different scales and control for uninformative features. Moreover, different types of noise are added to the encoders to promote the propagation of graph structural information and distribution uncertainty, which allows the model to capture complex posterior distributions. MSVGAE maps high-dimensional, noisy scRNA-seq data into a low-dimensional latent space, which benefits downstream tasks. In particular, MSVGAE can handle extremely sparse data. For the experiments, we create 24 simulated datasets covering various biological scenarios and collect 8 real-world datasets. The results of clustering, visualization, and marker gene analysis indicate that the MSVGAE model achieves excellent accuracy and robustness in analyzing scRNA-seq data.
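The encoder idea in the abstract (several encoders at different scales, each with injected noise, feeding a shared low-dimensional embedding) can be illustrated with a small forward-pass sketch. This is not the authors' implementation. It assumes plain graph-convolution layers in place of graph attention, Gaussian and Bernoulli noise as the two noise types, an inner-product decoder, and untrained random weights. All function names, layer sizes, and the toy ring graph are hypothetical.

```python
import numpy as np

def normalize_adj(adj):
    """Symmetrically normalize an adjacency matrix after adding self-loops."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_layer(a_norm, x, w, activate=True):
    """One plain graph-convolution layer (assumed stand-in for attention)."""
    h = a_norm @ x @ w
    return np.maximum(h, 0.0) if activate else h

def init_encoder(rng, in_dim, noise_dim, hidden_dim, latent_dim):
    """Random, untrained weights for one hypothetical encoder."""
    return {
        "w_hid": rng.normal(0, 0.1, (in_dim + noise_dim, hidden_dim)),
        "w_mu": rng.normal(0, 0.1, (hidden_dim, latent_dim)),
        "w_logvar": rng.normal(0, 0.1, (in_dim, latent_dim)),
        "noise_dim": noise_dim,
    }

def encode(rng, a_norm, x, enc, noise):
    """Semi-implicit style encoding: noise enters the input of the mean
    network, so the mean itself is random; the variance is explicit."""
    n = x.shape[0]
    if noise == "gaussian":
        eps = rng.normal(size=(n, enc["noise_dim"]))
    else:  # "bernoulli" (assumed second noise type)
        eps = rng.binomial(1, 0.5, size=(n, enc["noise_dim"])).astype(float)
    h = graph_layer(a_norm, np.hstack([x, eps]), enc["w_hid"])
    mu = graph_layer(a_norm, h, enc["w_mu"], activate=False)
    logvar = graph_layer(a_norm, x, enc["w_logvar"], activate=False)
    # Reparameterized sample from N(mu, diag(exp(logvar)))
    return mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)

def multi_encode(rng, a_norm, x, encoders, noises):
    """Concatenate the embeddings produced by encoders of different sizes."""
    parts = [encode(rng, a_norm, x, e, nz) for e, nz in zip(encoders, noises)]
    return np.hstack(parts)

def decode(z):
    """Inner-product decoder: edge probabilities from the embedding."""
    return 1.0 / (1.0 + np.exp(-(z @ z.T)))

# Toy data: 6 cells, 20 genes, sparse counts, cells connected in a ring.
rng = np.random.default_rng(0)
n_cells, n_genes = 6, 20
x = rng.poisson(0.3, size=(n_cells, n_genes)).astype(float)
adj = np.zeros((n_cells, n_cells))
for i in range(n_cells):
    adj[i, (i + 1) % n_cells] = adj[(i + 1) % n_cells, i] = 1.0

a_norm = normalize_adj(adj)
encoders = [
    init_encoder(rng, n_genes, noise_dim=4, hidden_dim=16, latent_dim=8),
    init_encoder(rng, n_genes, noise_dim=4, hidden_dim=8, latent_dim=4),
]
z = multi_encode(rng, a_norm, x, encoders, ["gaussian", "bernoulli"])
a_hat = decode(z)
print(z.shape, a_hat.shape)
```

In this sketch, `z` is the low-dimensional embedding that a downstream step such as clustering or visualization would consume, and `a_hat` is the reconstructed cell graph. A trained model would additionally need a reconstruction and regularization objective, which is omitted here.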