Abstract: The diffusion model has shown remarkable success in computer vision, but it remains unclear whether the ODE-based probability flow or the SDE-based diffusion model is superior, and under what circumstances. Comparing the two is challenging due to dependencies on data distributions, score training, and other numerical issues. In this paper, we study the problem mathematically for two limiting scenarios: the zero diffusion (ODE) case and the large diffusion case. We first introduce a pulse-shape error to perturb the score function and analyze how the error accumulates in sampling quality, followed by a thorough analysis generalizing to arbitrary errors. Our findings indicate that when the perturbation occurs at the end of the generative process, the ODE model outperforms the SDE model with a large diffusion coefficient. However, when the perturbation occurs earlier, the SDE model outperforms the ODE model, and we demonstrate that the error of sample generation due to such a pulse-shape perturbation is exponentially suppressed as the diffusion term's magnitude increases to infinity. Numerical validation of this phenomenon is provided using Gaussian, Gaussian mixture, and Swiss roll distributions, as well as realistic datasets like MNIST and CIFAR-10.

What problem does this paper attempt to address?

The paper studies how to choose the generation process in diffusion models, comparing the probability flow model based on ordinary differential equations (ODE) with the diffusion model based on stochastic differential equations (SDE). The core question is under what circumstances the ODE model or the SDE model is superior. Below, \(h\) denotes the magnitude of the diffusion term in the generation process, so \(h=0\) is the ODE case. The main contributions of the paper are as follows:

1. **Theoretical Analysis**:
   - When there is an error in training the score function, and this error occurs only at the end of the generation process, the ODE model (\(h=0\)) outperforms the SDE model (\(h \rightarrow \infty\)). (See Prop. 3.5)
   - If the error instead occurs in the middle stage of the process, then as \(h \rightarrow \infty\) the SDE model becomes exponentially better than the ODE model. (See Prop. 3.4)
2. **General Case of Error**:
   - For general score training errors, as \(h \rightarrow \infty\) the leading term of the sample generation error \(L(h)\) converges exponentially to a constant that depends only on the data distribution and on the score training error at the end of the generation process. (See Prop. 3.6)
3. **Numerical Validation**:
   - The paper validates these theoretical results numerically on Gaussian, Gaussian mixture, and Swiss roll distributions, and on the real datasets MNIST and CIFAR-10.

In summary, through theoretical analysis and numerical experiments, the paper shows how the ODE model and the SDE model respond differently to score function errors depending on when those errors occur, which informs the choice between the two in diffusion models.
Additionally, the paper proposes a possible method to adjust the loss function during the training process to adapt to specific diffusion coefficients, providing a direction for further research.
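
The qualitative claims above can be checked on a toy problem. The following is a minimal sketch and not the paper's own setup: it assumes one-dimensional data \(N(0, s_0^2)\), the forward process \(dX = -X\,dt + \sqrt{2}\,dW\), and a reverse-time family \(dY = [\,Y + (1 + h/2)\,s(Y)\,]\,d\tau + \sqrt{h}\,dW\), where \(s\) is the score. This parametrization of \(h\) is an assumption chosen so that \(h=0\) gives the probability flow ODE; the paper's may differ. A constant offset `eps` is added to the exact score during a short time window (the pulse). Because the drift is linear and the noise has zero mean, the mean of the generated samples follows a deterministic ODE, so the bias of the output mean can be integrated directly and used as a simple stand-in for the sample generation error. The function name and all parameter values are illustrative.

```python
import math

def final_mean_bias(h, pulse_start, pulse_len, eps=0.1, s0=0.5, T=3.0, n_steps=30000):
    """Bias of the generated mean caused by a pulse-shaped score error.

    Assumed toy setup (not taken from the paper): data ~ N(0, s0^2),
    forward SDE dX = -X dt + sqrt(2) dW, and reverse-time dynamics with
    drift Y + (1 + h/2) * score and noise sqrt(h) dW. The true data mean
    is 0, so the returned value is the error in the mean.
    """
    dt = T / n_steps
    c = 1.0 + h / 2.0          # (g^2 + h) / 2 with g^2 = 2
    m = 0.0                    # mean of the reverse process, starts at 0
    for k in range(n_steps):
        tau = k * dt           # reverse time: 0 = pure noise, T = data
        t = T - tau            # corresponding forward time
        var_t = 1.0 + (s0**2 - 1.0) * math.exp(-2.0 * t)  # forward variance
        in_pulse = pulse_start <= tau < pulse_start + pulse_len
        score_err = eps if in_pulse else 0.0
        # Exact Gaussian score of the mean is -m / var_t; add the pulse error.
        m += dt * (m + c * (-m / var_t + score_err))      # explicit Euler step
    return abs(m)

# Compare the ODE case (h = 0) with a large diffusion coefficient (h = 20)
# for a pulse in the middle of generation and a pulse at the very end.
for name, start in [("early pulse", 1.0), ("late pulse", 2.8)]:
    for h in (0.0, 20.0):
        print(name, "h =", h, "bias =", final_mean_bias(h, start, 0.2))
```

In this sketch a larger \(h\) strengthens the contraction toward the data distribution after the pulse, which is why an early error is damped, while an error at the very end has no time left to be damped and is amplified by the larger score coefficient.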

Exploring the Optimal Choice for Generative Processes in Diffusion Models: Ordinary vs Stochastic Differential Equations
