Abstract: Practitioners often aim to infer an unobserved population trajectory using sample snapshots at multiple time points. E.g., given single-cell sequencing data, scientists would like to learn how gene expression changes over a cell's life cycle. But sequencing any cell destroys that cell. So we can access data for any particular cell only at a single time point, but we have data across many cells. The deep learning community has recently explored using Schrödinger bridges (SBs) and their extensions in similar settings. However, existing methods either (1) interpolate between just two time points or (2) require a single fixed reference dynamic (often set to Brownian motion within SBs). But learning piecewise from adjacent time points can fail to capture long-term dependencies. And practitioners are typically able to specify a model family for the reference dynamic but not the exact values of the parameters within it. So we propose a new method that (1) learns the unobserved trajectories from sample snapshots across multiple time points and (2) requires specification only of a family of reference dynamics, not a single fixed one. We demonstrate the advantages of our method on simulated and real data.
What problem does this paper attempt to address?
### Problems the paper attempts to solve
The paper addresses the problem of inferring unobserved population trajectories from sample snapshots taken at multiple time points. The motivating example is single-cell sequencing, where researchers want to learn how gene expression changes over a cell's life cycle. Sequencing a cell destroys it, so each cell is observed at only one time point. What is available instead is a collection of different cells at each time point.
Existing methods have the following limitations:
1. **Interpolate only between two time points**: Many methods interpolate between just two time points. Applying them piecewise to adjacent time points can fail to capture long-term dependencies.
2. **Require a fixed reference dynamic**: Existing methods typically require a single fixed reference dynamic (often Brownian motion). In practice, researchers can usually specify a model family for the reference dynamic but not the exact parameter values within it.
To solve these problems, the authors propose a new method that:
1. **Learns unobserved trajectories from sample snapshots at multiple time points**.
2. **Only needs to specify the model family of the reference dynamic, rather than a single fixed reference dynamic**.
### Method overview
The authors' method iterates between two optimization steps:
1. **Given the current best estimate of the reference dynamic, learn the piecewise Schrödinger bridge and generate trajectories**.
2. **Use the generated trajectories to improve the estimate of the underlying dynamic**.
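Because every time interval contributes to the same estimate of the reference dynamic, information is shared across intervals rather than learned piecewise in isolation. Specifying only a model family lets the data refine the reference dynamic, which in turn improves trajectory inference.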
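The alternation above can be illustrated with a deliberately small sketch. It is not the authors' implementation: it assumes a one-dimensional linear reference family `dx = theta * x dt + sigma dW` with known `sigma`, replaces the learned Schrödinger bridge with a discrete static bridge between adjacent snapshots (Sinkhorn scaling of the reference transition kernel), and updates `theta` by coupling-weighted least squares. All function names, parameter values, and the simulated data are hypothetical.

```python
import numpy as np

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = a.max(axis=axis, keepdims=True)
    return np.squeeze(m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True)), axis=axis)

def simulate_snapshots(theta, sigma, dt, n_times, n_cells, rng):
    """Draw a fresh population for every time point, so no cell is observed twice."""
    snapshots = []
    for k in range(n_times):
        x = rng.normal(3.0, 0.5, size=n_cells)   # assumed initial distribution
        for _ in range(k):                       # k Euler-Maruyama steps of length dt
            x = x + theta * x * dt + sigma * np.sqrt(dt) * rng.normal(size=n_cells)
        snapshots.append(x)
    return snapshots

def static_bridge(x, y, theta, sigma, dt, n_iter=200):
    """Step 1 (simplified): coupling of two snapshots closest in KL to the
    reference transition kernel, computed by log-domain Sinkhorn scaling."""
    mean = x[:, None] * (1.0 + theta * dt)       # one Euler step of the reference
    log_k = -(y[None, :] - mean) ** 2 / (2.0 * sigma ** 2 * dt)
    log_a = np.full(len(x), -np.log(len(x)))     # uniform weights on each snapshot
    log_b = np.full(len(y), -np.log(len(y)))
    f, g = np.zeros(len(x)), np.zeros(len(y))
    for _ in range(n_iter):
        f = log_a - logsumexp(log_k + g[None, :], axis=1)
        g = log_b - logsumexp(log_k + f[:, None], axis=0)
    return np.exp(f[:, None] + log_k + g[None, :])

def update_theta(snapshots, couplings, dt):
    """Step 2: weighted least squares for theta, pooled over all intervals."""
    num, den = 0.0, 0.0
    for x, y, pi in zip(snapshots[:-1], snapshots[1:], couplings):
        num += np.sum(pi * x[:, None] * (y[None, :] - x[:, None]))
        den += dt * np.sum(pi * x[:, None] ** 2)
    return num / den

def fit_reference(snapshots, sigma, dt, theta_init=0.0, n_outer=8):
    """Alternate between bridging adjacent snapshots and refitting theta."""
    theta = theta_init
    for _ in range(n_outer):
        couplings = [static_bridge(x, y, theta, sigma, dt)
                     for x, y in zip(snapshots[:-1], snapshots[1:])]
        theta = update_theta(snapshots, couplings, dt)
    return theta, couplings

rng = np.random.default_rng(0)
theta_true, sigma, dt = -1.0, 1.0, 0.25          # illustrative values only
snapshots = simulate_snapshots(theta_true, sigma, dt, n_times=4, n_cells=200, rng=rng)
theta_hat, couplings = fit_reference(snapshots, sigma, dt)
print(f"true theta = {theta_true}, estimated theta = {theta_hat:.3f}")
```

The point of the sketch is the structure of the loop: every interval's coupling feeds one shared parameter update, which is how information can move between intervals. A realistic version would use a dynamic bridge solver and a richer reference family.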
### Experimental results
The authors ran experiments on simulated and real data. The results indicate that their method is both more accurate and substantially faster than existing methods. The experiments include:
1. **Lotka-Volterra predator-prey model**: On synthetic data, the authors' method better captures the curvature of the dynamics and requires much less computation time than other methods.
2. **Repressilator model**: On data simulating the circadian rhythm of cyanobacteria, the authors' method accurately captures the cyclic behavior, while other methods fail to do so.
3. **Gulf Stream data**: On real ocean data, the authors' method still performs well even though the reference dynamic model does not exactly match the data-generating process.
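To make the first benchmark concrete, here is a minimal sketch of how snapshot data could be generated from a stochastic Lotka-Volterra model using Euler-Maruyama steps. The drift is the standard predator-prey form. The parameter values, noise level, initial distribution, and the names `lotka_volterra_drift` and `sample_snapshots` are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np

def lotka_volterra_drift(state, alpha, beta, gamma, delta):
    """Standard predator-prey drift; columns of `state` are (prey, predator)."""
    prey, pred = state[:, 0], state[:, 1]
    d_prey = alpha * prey - beta * prey * pred
    d_pred = delta * prey * pred - gamma * pred
    return np.stack([d_prey, d_pred], axis=1)

def sample_snapshots(params, times, n_per_snapshot, noise=0.1, step=0.01, seed=0):
    """Simulate an independent population for each observation time, so that
    no individual particle appears in more than one snapshot."""
    rng = np.random.default_rng(seed)
    snapshots = []
    for t_end in times:
        # Assumed initial distribution: tight Gaussian around (5, 4).
        state = rng.normal([5.0, 4.0], 0.1, size=(n_per_snapshot, 2))
        for _ in range(int(round(t_end / step))):
            drift = lotka_volterra_drift(state, **params)
            state = state + drift * step + noise * np.sqrt(step) * rng.normal(size=state.shape)
            state = np.clip(state, 1e-6, None)   # keep populations positive
        snapshots.append(state)
    return snapshots

# Hypothetical member of the reference family; an inference method would be
# told only the functional form, not these values.
params = {"alpha": 1.0, "beta": 0.4, "gamma": 1.0, "delta": 0.1}
times = [0.0, 0.5, 1.0, 1.5, 2.0]
lv_snapshots = sample_snapshots(params, times, n_per_snapshot=50)
for t, snap in zip(times, lv_snapshots):
    print(f"t = {t:.1f}: mean (prey, predator) = {snap.mean(axis=0).round(2)}")
```

In this setup the "model family" is the drift function with `alpha`, `beta`, `gamma`, and `delta` left free, which is the kind of partial knowledge the summary says practitioners typically have.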
### Conclusion
By iterating between trajectory inference and refinement of the reference dynamic, the proposed method infers unobserved trajectories from multi-time-point sample snapshots more accurately than existing methods, and does so at lower computational cost. This makes it a useful tool for analyzing dynamical systems in biology, medicine, and other fields.