Learning cell fate landscapes from spatial transcriptomics using Fused Gromov-Wasserstein

Geert-Jan Huizing, Gabriel Peyre, Laura Cantini
DOI: https://doi.org/10.1101/2024.07.26.605241
2024-07-26
Abstract: In dynamic biological processes such as development, spatial transcriptomics is revolutionizing the study of the mechanisms underlying spatial organization within tissues. Inferring cell fate trajectories from spatial transcriptomics profiled at several time points has thus emerged as a critical goal, requiring novel computational methods. Wasserstein gradient flow learning is a promising framework for analyzing sequencing data across time, built around a neural network representing the differentiation potential. However, existing gradient flow learning methods cannot analyze spatially resolved transcriptomic data. Here, we propose STORIES, a method that employs an extension of Optimal Transport to learn a spatially informed potential. We benchmark our approach using three large Stereo-seq spatiotemporal atlases and demonstrate superior spatial coherence compared to existing approaches. Finally, we provide an in-depth analysis of axolotl neural regeneration and mouse gliogenesis, recovering gene trends for known markers such as Nptx1 in neuron regeneration and Aldh1l1 in gliogenesis, as well as additional putative drivers.
Bioinformatics
What problem does this paper attempt to address?
This paper addresses the problem of inferring cell fate trajectories from spatiotemporal transcriptomics data in dynamic biological processes such as development and disease occurrence. Existing gradient flow learning methods can handle single-cell sequencing data profiled across time, but they cannot analyze spatially resolved transcriptomic data. The paper therefore proposes STORIES, a method that uses the Fused Gromov-Wasserstein (FGW) distance, an extension of Optimal Transport (OT), to learn a differentiation potential that incorporates spatial information.

### Main Issues:

1. **Limitations of Existing Methods**:
   - Existing trajectory inference methods (such as pseudotime-based and velocity-based methods) do not provide a complete model of differentiation and cannot predict the future transcriptional state of cells.
   - Existing optimal transport-based methods (such as Waddington OT) can infer trajectories but cannot handle spatially resolved transcriptomics data.
2. **Special Challenges of Spatiotemporal Transcriptomics Data**:
   - Spatiotemporal transcriptomics data contains spatial locations in addition to gene expression, and existing methods cannot effectively integrate the two.
   - Slices profiled at different time points may not be aligned, so a method must be robust to changes in spatial coordinates such as rotation and translation.

### Solution:

- **STORIES Method**:
  - STORIES relies on the FGW distance, which compares gene expression profiles directly but compares spatial coordinates only through pairwise distances within each slice. This makes the comparison insensitive to rotation and translation of a slice.
  - The method learns a neural network that maps each cell's gene expression profile to a differentiation potential value, which indicates the cell's differentiation stage.
  - STORIES can predict cell states at future time points and also provides biologically meaningful outputs, such as the order of cell differentiation and the direction of changes in gene expression.
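
The two ingredients described above, a potential that drives cells forward in time and an FGW comparison with the next time point, can be illustrated with a small NumPy sketch. This is not the authors' implementation. The toy one-hidden-layer potential, the explicit step size `tau`, the weighting convention for `alpha`, the use of squared Euclidean distances, and all function names are assumptions made for illustration. The sketch also evaluates the FGW objective at a fixed independent coupling instead of minimizing over couplings, which a real FGW solver would do.

```python
import numpy as np

def sq_dists(A, B):
    """Squared Euclidean distances between rows of A and rows of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sum(diff**2, axis=-1)

def potential(X, W1, b1, w2):
    """Toy potential J(x) = w2 . tanh(W1 x + b1); one scalar per cell."""
    return np.tanh(X @ W1.T + b1) @ w2

def potential_grad(X, W1, b1, w2):
    """Analytic gradient of the toy potential for each cell."""
    H = np.tanh(X @ W1.T + b1)        # (n_cells, n_hidden)
    return ((1.0 - H**2) * w2) @ W1   # (n_cells, n_genes)

def fgw_objective(T, M, C1, C2, alpha):
    """FGW cost of a given coupling T with the square loss.

    linear term:    sum_ij   M_ij T_ij
    quadratic term: sum_ijkl (C1_ik - C2_jl)^2 T_ij T_kl
    Assumed convention: (1 - alpha) * linear + alpha * quadratic.
    """
    p, q = T.sum(axis=1), T.sum(axis=0)
    linear = np.sum(M * T)
    # Expand (a - b)^2 = a^2 + b^2 - 2ab to avoid a 4-index tensor.
    quadratic = (p @ (C1**2) @ p + q @ (C2**2) @ q
                 - 2.0 * np.sum((C1 @ T @ C2.T) * T))
    return (1.0 - alpha) * linear + alpha * quadratic

rng = np.random.default_rng(0)
n, m, n_genes, n_hidden = 30, 40, 5, 16

# Synthetic stand-ins for two time points (expression X, coordinates S).
X_t, S_t = rng.normal(size=(n, n_genes)), rng.normal(size=(n, 2))
X_next, S_next = rng.normal(size=(m, n_genes)), rng.normal(size=(m, 2))

# Random weights stand in for a trained potential network.
W1 = 0.5 * rng.normal(size=(n_hidden, n_genes))
b1 = np.zeros(n_hidden)
w2 = rng.normal(size=n_hidden)

# Explicit Euler step: each cell moves down the potential's gradient.
tau = 0.01
X_pred = X_t - tau * potential_grad(X_t, W1, b1, w2)

# Compare predicted cells with the next time point.
T = np.full((n, m), 1.0 / (n * m))   # fixed independent coupling
M = sq_dists(X_pred, X_next)         # expression cost across slices
C1 = sq_dists(S_t, S_t)              # spatial structure within slice t
C2 = sq_dists(S_next, S_next)        # spatial structure within next slice
cost = fgw_objective(T, M, C1, C2, alpha=0.5)

# Rotate and translate the second slice: within-slice distances, and
# therefore the objective, should not change.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
S_moved = S_next @ R.T + np.array([10.0, -5.0])
cost_moved = fgw_objective(T, M, C1, sq_dists(S_moved, S_moved), alpha=0.5)

print(np.isclose(cost, cost_moved))
```

Because the spatial coordinates enter only through the within-slice matrices `C1` and `C2`, no explicit alignment of slices is needed. In a training setting, a cost of this kind would be minimized with respect to the potential's weights; that optimization is omitted here.
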

### Experimental Validation:

- **Benchmark Testing**:
  - The authors benchmarked STORIES on three large-scale Stereo-seq spatiotemporal atlases, covering mouse development, zebrafish development, and Mexican axolotl brain regeneration.
  - Results show that STORIES outperforms existing linear methods in both gene expression prediction and spatial consistency.
- **Application Cases**:
  - **Mexican Axolotl Neural Regeneration**: STORIES identified cell fate trajectories during neural regeneration and recovered known marker genes (such as Nptx1) along with other potential driver genes.
  - **Mouse Dorsal Midbrain Gliogenesis**: STORIES identified cell fate trajectories during gliogenesis and recovered known marker genes (such as Aldh1l1) along with other potential driver genes.

### Summary:

This paper proposes a new computational framework, STORIES, which infers cell fate trajectories from spatiotemporal transcriptomics data and has been validated for effectiveness and biological relevance in multiple biological processes. STORIES provides accurate gene expression predictions while maintaining spatial consistency, offering a powerful tool for understanding complex biological processes.