A Complete Recipe for Diffusion Generative Models

Kushagra Pandey, Stephan Mandt

2023-10-12

Abstract: Score-based Generative Models (SGMs) have demonstrated exceptional synthesis outcomes across various tasks. However, the current design landscape of the forward diffusion process remains largely untapped and often relies on physical heuristics or simplifying assumptions. Utilizing insights from the development of scalable Bayesian posterior samplers, we present a complete recipe for formulating forward processes in SGMs, ensuring convergence to the desired target distribution. Our approach reveals that several existing SGMs can be seen as specific manifestations of our framework. Building upon this method, we introduce Phase Space Langevin Diffusion (PSLD), which relies on score-based modeling within an augmented space enriched by auxiliary variables akin to physical phase space. Empirical results exhibit the superior sample quality and improved speed-quality trade-off of PSLD compared to various competing approaches on established image synthesis benchmarks. Remarkably, PSLD achieves sample quality akin to state-of-the-art SGMs (FID: 2.10 for unconditional CIFAR-10 generation). Lastly, we demonstrate the applicability of PSLD in conditional synthesis using pre-trained score networks, offering an appealing alternative as an SGM backbone for future advancements. Code and model checkpoints can be accessed at https://github.com/mandt-lab/PSLD.

Machine Learning, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper addresses the lack of a systematic framework for designing the forward diffusion process in Score-Based Generative Models (SGMs). Existing forward processes often rely on physical heuristics or simplifying assumptions, which leaves much of the design space unexplored. The authors therefore propose a recipe for constructing forward processes that converge to a desired target distribution, and show that several existing SGMs are special cases of it. Based on this framework, they introduce Phase Space Langevin Diffusion (PSLD), which performs score-based modeling in a space augmented with auxiliary variables and injects noise in both the data space and the auxiliary-variable space, yielding better sample quality and an improved speed-quality trade-off. In summary, the main objectives of the paper are:

1. Proposing a complete recipe for designing the forward diffusion process that ensures convergence to the specified target distribution.
2. Introducing PSLD, which performs diffusion in the joint space of data and auxiliary variables, and demonstrating its advantages on image synthesis tasks.
3. Showing that PSLD outperforms baseline methods in unconditional image generation, and that it supports conditional synthesis using pre-trained score networks.
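The convergence property described above can be checked numerically in a small linear case. The sketch below is illustrative only and is not taken from the summary: it assumes a forward process of the form dz = -(D + Q) grad H(z) dt + sqrt(2 D) dw, with D symmetric positive semi-definite and Q skew-symmetric (the general form used in the Bayesian posterior-sampling literature the abstract alludes to), applied to one data coordinate x and one auxiliary coordinate m with a Gaussian target N(0, 1) x N(0, M). The parameter names beta, gamma, nu, and M and their values are assumptions chosen for the example. For a linear SDE dz = A z dt + sqrt(2 D) dw, the Gaussian N(0, Sigma) is stationary when A Sigma + Sigma A^T + 2 D = 0, so the code builds A and checks that this residual vanishes.

```python
# Illustrative sketch: all parameter names and values are assumptions,
# not settings taken from the paper or its code release.

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def add(*ms):
    """Element-wise sum of any number of 2x2 matrices."""
    return [[sum(m[i][j] for m in ms) for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def drift_matrix(D, Q, Sigma_inv):
    """Linear drift A with f(z) = A z = -(D + Q) grad H(z), grad H(z) = Sigma_inv z."""
    return scale(-1.0, matmul(add(D, Q), Sigma_inv))

def lyapunov_residual(A, D, Sigma):
    """A Sigma + Sigma A^T + 2 D; zero when N(0, Sigma) is stationary
    for dz = A z dt + sqrt(2 D) dw."""
    return add(matmul(A, Sigma), matmul(Sigma, transpose(A)), scale(2.0, D))

# Hypothetical hyperparameters: gamma scales noise in the data coordinate,
# nu scales noise in the auxiliary coordinate, M is the auxiliary variance.
beta, gamma, nu, M = 8.0, 0.04, 4.0, 0.25

D = scale(beta / 2, [[gamma, 0.0], [0.0, M * nu]])   # symmetric PSD: diffusion
Q = scale(beta / 2, [[0.0, -1.0], [1.0, 0.0]])       # skew-symmetric: coupling
Sigma = [[1.0, 0.0], [0.0, M]]                       # target covariance
Sigma_inv = [[1.0, 0.0], [0.0, 1.0 / M]]

A = drift_matrix(D, Q, Sigma_inv)
residual = lyapunov_residual(A, D, Sigma)
max_residual = max(abs(v) for row in residual for v in row)
print("drift matrix:", A)
print("max |Lyapunov residual|:", max_residual)
```

Under these assumptions the residual is zero for any symmetric D and skew-symmetric Q, which is the sense in which such a construction fixes the target distribution while leaving the noise allocation free. Setting gamma to zero in this sketch removes the noise from the data coordinate and leaves it only on the auxiliary one.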
