Simulation-free Schrödinger bridges via score and flow matching

Alexander Tong, Nikolay Malkin, Kilian Fatras, Lazar Atanackovic, Yanlei Zhang, Guillaume Huguet, Guy Wolf, Yoshua Bengio
2024-03-11
Abstract: We present simulation-free score and flow matching ([SF]$^2$M), a simulation-free objective for inferring stochastic dynamics given unpaired samples drawn from arbitrary source and target distributions. Our method generalizes both the score-matching loss used in the training of diffusion models and the recently proposed flow matching loss used in the training of continuous normalizing flows. [SF]$^2$M interprets continuous-time stochastic generative modeling as a Schrödinger bridge problem. It relies on static entropy-regularized optimal transport, or a minibatch approximation, to efficiently learn the SB without simulating the learned stochastic process. We find that [SF]$^2$M is more efficient and gives more accurate solutions to the SB problem than simulation-based methods from prior work. Finally, we apply [SF]$^2$M to the problem of learning cell dynamics from snapshot data. Notably, [SF]$^2$M is the first method to accurately model cell dynamics in high dimensions and can recover known gene regulatory networks from simulated data. Our code is available in the TorchCFM package at https://github.com/atong01/conditional-flow-matching.
Machine Learning
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper addresses the problem of training continuous-time stochastic generative models without simulating the learned process. Specifically, the authors propose **Simulation-Free Score and Flow Matching ([SF]²M)**, a method for inferring the stochastic dynamics that connect unpaired samples drawn from arbitrary source and target distributions.

#### Main Contributions:

1. **Proposing [SF]²M**: This is the first simulation-free objective for the Schrödinger bridge (SB) problem, and the authors prove its correctness.
2. **Efficient Empirical and Mini-batch Approximations**: The study explores efficient empirical and mini-batch approximations of the entropic optimal transport (OT) plan used in [SF]²M.
3. **Validation of Method Effectiveness**: The method is validated on synthetic distributions and on several single-cell dynamics problems.

#### Method Overview:

- **[SF]²M Algorithm**: Uses static entropic optimal transport, or a mini-batch approximation of it, to learn the Schrödinger bridge efficiently without simulating the learned stochastic process.
- **Single-Cell Dynamics Modeling**: According to the authors, [SF]²M is the first method to accurately model single-cell dynamics in high dimensions, and it can recover known gene regulatory networks from simulated data.
- **Dynamic Gene Regulatory Network Modeling**: Learns gene regulatory networks directly from gene expression data by analyzing the sparsity patterns of the trained neural networks.

Taken together, these results indicate that [SF]²M makes modeling single-cell data more efficient and can help reveal the dynamical mechanisms of biological systems.
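
The summary above names the ingredients (an entropic OT plan computed on a mini-batch, plus score and flow regression targets) without showing how they fit together. The following is a minimal NumPy sketch of one way to build those targets, not the authors' TorchCFM implementation. Several details are assumptions that do not appear in the text above: a squared-Euclidean cost, an entropic regularization of `2 * sigma**2`, Brownian-bridge conditional paths between paired samples, and restricting `t` to `[0.05, 0.95]` to keep the targets finite. The function names `sinkhorn_plan` and `sf2m_targets` are hypothetical.

```python
import numpy as np


def sinkhorn_plan(x0, x1, reg, n_iters=200):
    """Entropic OT plan between two uniform-weight mini-batches (plain Sinkhorn)."""
    n, m = len(x0), len(x1)
    # Squared-Euclidean cost matrix, shape (n, m). Assumed cost, see lead-in.
    cost = ((x0[:, None, :] - x1[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-cost / reg)
    a = np.full(n, 1.0 / n)  # uniform source weights
    b = np.full(m, 1.0 / m)  # uniform target weights
    u = np.ones(n)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)  # rescale to match column marginals
        u = a / (kernel @ v)    # rescale to match row marginals
    return u[:, None] * kernel * v[None, :]


def sf2m_targets(x0, x1, sigma, rng):
    """Sample (t, x_t) and regression targets for a flow and a score network.

    Assumes Brownian-bridge conditional paths between OT-paired samples:
        x_t ~ N((1 - t) * a + t * b, sigma^2 * t * (1 - t) * I)
    """
    plan = sinkhorn_plan(x0, x1, reg=2.0 * sigma**2)  # assumed regularization
    # Draw index pairs (i, j) with probability proportional to the plan.
    flat = rng.choice(plan.size, size=len(x0), p=plan.ravel() / plan.sum())
    i, j = np.unravel_index(flat, plan.shape)
    a, b = x0[i], x1[j]

    # Stay away from t = 0 and t = 1, where 1 / (t * (1 - t)) diverges.
    t = rng.uniform(0.05, 0.95, size=(len(x0), 1))
    mean = (1.0 - t) * a + t * b
    var = sigma**2 * t * (1.0 - t)
    xt = mean + np.sqrt(var) * rng.standard_normal(a.shape)

    # Velocity of the probability-flow ODE of the Gaussian bridge marginal.
    flow_target = (1.0 - 2.0 * t) / (2.0 * t * (1.0 - t)) * (xt - mean) + (b - a)
    # Score of the Gaussian bridge marginal: grad_x log N(x; mean, var * I).
    score_target = -(xt - mean) / var
    return t, xt, flow_target, score_target
```

In training, two networks taking `(t, xt)` as input would be regressed onto `flow_target` and `score_target` with a squared-error loss, typically with a time-dependent weight on the score term because its magnitude grows near the endpoints. No stochastic differential equation is simulated at any point, which is the sense in which the objective is simulation-free.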