Sequential Flow Straightening for Generative Modeling

Jongmin Yoon, Juho Lee
2024-02-09
Abstract: Straightening the probability flow of continuous-time generative models, such as diffusion models or flow-based models, is the key to fast sampling with numerical solvers. Existing methods learn a linear path by directly generating the probability path of the joint distribution between the noise and data distributions. One key reason for the slow sampling speed of ODE-based solvers that simulate these generative models is the global truncation error of the ODE solver, which is caused by the high curvature of the ODE trajectory and grows sharply in the low-NFE (number of function evaluations) regime. To address this challenge, we propose SeqRF, a learning technique that straightens the probability flow to reduce the global truncation error, thereby accelerating sampling and improving synthesis quality. In both theoretical and empirical studies, we first observe the straightening property of SeqRF. In empirical evaluations of SeqRF on flow-based generative models, we achieve surpassing results on the CIFAR-10, CelebA-$64 \times 64$, and LSUN-Church datasets.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to reduce the global truncation error of numerical solvers in continuous-time generative models, in order to speed up sampling and improve image synthesis quality. Existing methods learn linear paths by directly generating probability paths between the noise and data distributions, but they still suffer from slow sampling and unstable training. To solve this problem, the paper proposes Sequential Reflow (SeqRF), which straightens the probability flow through temporal subdivision, thereby reducing the global truncation error and improving flow-based generative modeling. Experiments show that the method surpasses existing methods on the CIFAR-10, CelebA-64 × 64, and LSUN-Church datasets.
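The link between trajectory curvature, truncation error, and NFE can be illustrated with a toy sketch. The code below is not the paper's method or code: it uses a hand-written explicit Euler solver and three hypothetical 2D velocity fields (`straight_velocity`, `curved_velocity`, and `make_piecewise_velocity`, all names assumed for illustration) that move a point from (1, 0) to (-1, 0) over t in [0, 1]. It compares the endpoint error on a straight path, on a curved path, and on a path that is straight only within each of K time segments, which is a loose stand-in for the idea of temporal subdivision.

```python
import math


def euler(velocity, x0, t0, t1, n_steps):
    """Integrate dx/dt = velocity(x, t) from t0 to t1 with explicit Euler steps."""
    x, t = x0, t0
    h = (t1 - t0) / n_steps
    for _ in range(n_steps):
        v = velocity(x, t)
        x = (x[0] + h * v[0], x[1] + h * v[1])
        t += h
    return x


def dist(a, b):
    """Euclidean distance between two 2D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def straight_velocity(x, t):
    """Constant velocity: a straight line from (1, 0) to (-1, 0)."""
    return (-2.0, 0.0)


def curved_velocity(x, t):
    """Half rotation about the origin: a semicircle from (1, 0) to (-1, 0)."""
    return (-math.pi * x[1], math.pi * x[0])


def make_piecewise_velocity(k_segments):
    """Velocity that is constant inside each of k_segments equal time segments.

    The resulting path is piecewise linear through waypoints on the semicircle
    (an assumption made purely for this toy example).
    """
    waypoints = [
        (math.cos(math.pi * k / k_segments), math.sin(math.pi * k / k_segments))
        for k in range(k_segments + 1)
    ]

    def velocity(x, t):
        # Small offset guards against floating-point drift at segment borders.
        k = min(int(t * k_segments + 1e-9), k_segments - 1)
        a, b = waypoints[k], waypoints[k + 1]
        return (k_segments * (b[0] - a[0]), k_segments * (b[1] - a[1]))

    return velocity


START, TARGET = (1.0, 0.0), (-1.0, 0.0)


def endpoint_error(velocity, nfe):
    """Distance between the Euler endpoint and the true endpoint."""
    return dist(euler(velocity, START, 0.0, 1.0, nfe), TARGET)


if __name__ == "__main__":
    for nfe in (1, 2, 4, 8, 64):
        print(
            f"NFE={nfe:3d}  straight={endpoint_error(straight_velocity, nfe):.3e}"
            f"  curved={endpoint_error(curved_velocity, nfe):.3e}"
        )
    k = 4
    print(f"piecewise (K={k}), NFE={k}: "
          f"{endpoint_error(make_piecewise_velocity(k), k):.3e}")
```

In this toy setting the straight path is integrated essentially exactly even with a single step, while the curved path needs many steps before its error becomes small. The piecewise case suggests why straightness within each time segment can be enough when the solver takes one step per segment, although the actual SeqRF training procedure is described in the paper and is not reproduced here.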