VAPOR: Variational autoencoder with transport operators decouples co-occurring biological processes in development

Jie Sheng, Daifeng Wang
DOI: https://doi.org/10.1101/2024.10.27.620534
2024-10-29
Abstract:
Background: Emerging single-cell and spatial transcriptomic data enable the investigation of gene expression dynamics in various biological processes, especially development. To this end, existing computational methods typically infer trajectories that order cells sequentially to reveal gene expression changes during development, e.g., by assigning each cell a pseudotime that indicates its position in the ordering. However, these trajectories can aggregate different biological processes that cells undergo simultaneously, such as maturation for specialized function and differentiation into specific cell types, which do not occur on the same timescale. Therefore, a single pseudotime axis may not distinguish the gene expression dynamics of co-occurring processes.
Methods: We introduce a method, VAPOR (variational autoencoder with transport operators), to decouple dynamic patterns from developmental gene expression data. In particular, VAPOR learns a latent space for gene expression dynamics and decomposes that space into multiple subspaces. The dynamics on each subspace are governed by an ordinary differential equation model that attempts to recapitulate a specific biological process. Furthermore, VAPOR can infer process-specific pseudotimes, revealing the multifaceted timescales of distinct processes in which cells may be involved simultaneously during development.
Results: In initial tests on simulated datasets, VAPOR effectively recovered the topology and decoupled distinct dynamic patterns in the data. We then applied VAPOR to a developmental human brain scRNA-seq dataset spanning postconceptional weeks and identified gene expression dynamics for several key processes, such as differentiation and maturation. Moreover, our benchmarking analyses showed that VAPOR outperformed other methods. Additionally, we applied VAPOR to spatial transcriptomics data from the human dorsolateral prefrontal cortex. VAPOR captured the 'inside-out' pattern across cortical layers, potentially revealing how the layers were formed, as characterized by their gene expression dynamics.
Conclusion: VAPOR is open source for general use (https://github.com/daifengwanglab/VAPOR) to parameterize and infer developmental gene expression dynamics. It can be further extended to other single-cell and spatial omics, such as chromatin accessibility, to reveal developmental epigenomic dynamics.
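The idea in the Methods part of the abstract, a latent space split into subspaces that each follow their own ordinary differential equation and their own pseudotime, can be illustrated with a toy sketch. This is not VAPOR's implementation: the encoder and decoder are omitted, the two linear operators are hand-picked rather than learned, and every name below (`OPERATORS`, `integrate_linear`, `flow`, the process labels) is a hypothetical choice made for illustration only.

```python
import math


def matvec(A, z):
    """Multiply a small dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, z)) for row in A]


def integrate_linear(A, z0, t, steps=1000):
    """Integrate dz/dt = A z from time 0 to t with classical RK4."""
    h = t / steps
    z = list(z0)
    for _ in range(steps):
        k1 = matvec(A, z)
        k2 = matvec(A, [zi + 0.5 * h * k for zi, k in zip(z, k1)])
        k3 = matvec(A, [zi + 0.5 * h * k for zi, k in zip(z, k2)])
        k4 = matvec(A, [zi + h * k for zi, k in zip(z, k3)])
        z = [
            zi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for zi, a, b, c, d in zip(z, k1, k2, k3, k4)
        ]
    return z


# Assumption: one hand-picked linear operator per 2-D latent subspace.
# In a trained model these would be learned from data.
OPERATORS = {
    "differentiation": [[0.0, -1.0], [1.0, 0.0]],  # rotation in the plane
    "maturation": [[-0.5, 0.0], [0.0, -0.5]],      # exponential decay
}


def flow(z0_by_process, pseudotime_by_process):
    """Advance each subspace along its own ODE for its own pseudotime."""
    return {
        name: integrate_linear(
            OPERATORS[name], z0_by_process[name], pseudotime_by_process[name]
        )
        for name in OPERATORS
    }


# Two hypothetical cells that share a start state and a differentiation
# pseudotime but differ in maturation pseudotime.
z0 = {"differentiation": [1.0, 0.0], "maturation": [2.0, 2.0]}
early = flow(z0, {"differentiation": 0.5, "maturation": 0.1})
late = flow(z0, {"differentiation": 0.5, "maturation": 3.0})
```

Because each subspace is integrated separately, `early` and `late` have identical coordinates in the differentiation subspace and differ only in the maturation subspace. A single shared pseudotime could not represent that pair of cells, which is the motivation the abstract gives for process-specific pseudotimes.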
Bioinformatics
What problem does this paper attempt to address?