EM Distillation for One-step Diffusion Models

Sirui Xie, Zhisheng Xiao, Diederik P Kingma, Tingbo Hou, Ying Nian Wu, Kevin Patrick Murphy, Tim Salimans, Ben Poole, Ruiqi Gao

2024-05-27

Abstract: While diffusion models can learn complex distributions, sampling requires a computationally expensive iterative process. Existing distillation methods enable efficient sampling, but have notable limitations, such as performance degradation with very few sampling steps, reliance on training data access, or mode-seeking optimization that may fail to capture the full distribution. We propose EM Distillation (EMD), a maximum likelihood-based approach that distills a diffusion model to a one-step generator model with minimal loss of perceptual quality. Our approach is derived through the lens of Expectation-Maximization (EM), where the generator parameters are updated using samples from the joint distribution of the diffusion teacher prior and inferred generator latents. We develop a reparametrized sampling scheme and a noise cancellation technique that together stabilize the distillation process. We further reveal an interesting connection of our method with existing methods that minimize mode-seeking KL. EMD outperforms existing one-step generative methods in terms of FID scores on ImageNet-64 and ImageNet-128, and compares favorably with prior work on distilling text-to-image diffusion models.
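The contrast between mode-seeking and mode-covering objectives mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's method: it fits a single 1D Gaussian to a hypothetical bimodal target by brute-force grid search, once under the forward KL (mode covering, the maximum likelihood direction) and once under the reverse KL (mode seeking). The target mixture, the candidate grid, and all function names are assumptions made purely for illustration.

```python
import math

def gauss(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def target(x):
    # Toy bimodal stand-in for a "teacher": equal mixture of N(-3, 0.5^2) and N(3, 0.5^2).
    return 0.5 * gauss(x, -3.0, 0.5) + 0.5 * gauss(x, 3.0, 0.5)

DX = 0.04
XS = [-12.0 + DX * i for i in range(601)]  # integration grid on [-12, 12]
TINY = 1e-300                              # guards log(0) if a density underflows

def kl(p, q):
    """Riemann-sum approximation of KL(p || q) on the grid XS."""
    total = 0.0
    for x in XS:
        px, qx = p(x), q(x)
        if px > 0.0:
            total += px * (math.log(px) - math.log(max(qx, TINY))) * DX
    return total

def best_fit(forward):
    """Grid-search the Gaussian q minimizing KL(target || q) if forward, else KL(q || target)."""
    best = None
    for mu in [-4.0 + 0.5 * i for i in range(17)]:
        for sigma in [0.5, 1.0, 2.0, 3.0, 4.0]:
            q = lambda x, m=mu, s=sigma: gauss(x, m, s)
            score = kl(target, q) if forward else kl(q, target)
            if best is None or score < best[0]:
                best = (score, mu, sigma)
    return best

fwd = best_fit(forward=True)   # mode covering: one wide Gaussian spanning both modes
rev = best_fit(forward=False)  # mode seeking: one narrow Gaussian locked onto a single mode
print("forward KL fit: mu=%.1f sigma=%.1f" % (fwd[1], fwd[2]))
print("reverse KL fit: mu=%.1f sigma=%.1f" % (rev[1], rev[2]))
```

In this toy setting the forward-KL fit spreads over both modes, while the reverse-KL fit collapses onto one of them and misses half of the target mass. This is the failure mode the abstract attributes to mode-seeking optimization.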

Machine Learning, Artificial Intelligence

What problem does this paper attempt to address?

This paper addresses how to distill a pretrained diffusion model into a generator that samples in a single step. Diffusion models generate high-quality images and data in other modalities, but sampling requires many iterative steps and is computationally expensive. Existing distillation methods accelerate sampling but have limitations: performance degrades when very few sampling steps are used, some require access to the training data, and some rely on mode-seeking optimization that may fail to capture the full distribution. The paper proposes EM Distillation (EMD), a maximum likelihood-based method that minimizes a mode-covering divergence between the pretrained diffusion teacher and a latent-variable student model. The student's parameters are updated within the Expectation-Maximization (EM) framework, using samples from the joint distribution of the teacher prior and the inferred generator latents; a reparametrized sampling scheme and a noise cancellation technique together stabilize the distillation process. The paper also relates EMD to existing methods that minimize a mode-seeking KL divergence, such as Variational Score Distillation and Diff-Instruct, and shows that adjusting the Markov Chain Monte Carlo (MCMC) sampling intensity trades off mode seeking against mode covering. In experiments, EMD outperforms existing one-step generative methods in FID on ImageNet-64 and ImageNet-128 conditional generation, and compares favorably with prior work on distilling text-to-image diffusion models.
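The EM view of a maximum likelihood objective for a latent-variable student can be written out as a short derivation. This is a simplified sketch in assumed notation, not the paper's exact formulation: p_T(x) denotes the teacher distribution and p_θ(x, z) = p(z) p_θ(x | z) the student with latent z. The paper's actual objective may also involve noised samples across diffusion noise levels, which is omitted here.

```latex
% Assumed notation (introduced for this sketch only):
%   p_T(x)                                  teacher (target) distribution
%   p_\theta(x, z) = p(z)\, p_\theta(x \mid z)   latent-variable student
\begin{align}
% Forward (mode-covering) KL: only the student log-likelihood depends on \theta.
\nabla_\theta \, D_{\mathrm{KL}}\big(p_T(x) \,\|\, p_\theta(x)\big)
  &= -\,\mathbb{E}_{p_T(x)}\big[\nabla_\theta \log p_\theta(x)\big] \\
% Fisher's identity: the marginal score is a posterior average of the joint score.
\nabla_\theta \log p_\theta(x)
  &= \mathbb{E}_{p_\theta(z \mid x)}\big[\nabla_\theta \log p_\theta(x, z)\big] \\
% Combining the two: an expectation over teacher samples paired with inferred latents.
\nabla_\theta \, D_{\mathrm{KL}}\big(p_T(x) \,\|\, p_\theta(x)\big)
  &= -\,\mathbb{E}_{p_T(x)\, p_\theta(z \mid x)}\big[\nabla_\theta \log p_\theta(x, z)\big]
\end{align}
```

Read this way, the E-step draws pairs (x, z) with x from the teacher and z from the student's posterior, which generally has to be approximated (the summary mentions MCMC), and the M-step takes a gradient step on the joint log-likelihood using those pairs.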

One-Step Diffusion Distillation via Deep Equilibrium Models

One-step Diffusion with Distribution Matching Distillation

Multistep Distillation of Diffusion Models via Moment Matching

Relational Diffusion Distillation for Efficient Image Generation

Multi-student Diffusion Distillation for Better One-step Generators

Diffusion Models Are Innate One-Step Generators

Simple and Fast Distillation of Diffusion Models

SFDDM: Single-fold Distillation for Diffusion models

Distilling Diffusion Models into Conditional GANs

Plug-and-Play Diffusion Distillation

One-Step Diffusion Distillation through Score Implicit Matching

Improved Distribution Matching Distillation for Fast Image Synthesis

Efficient Dataset Distillation via Minimax Diffusion

Latent Dataset Distillation with Diffusion Models

Distillation of Discrete Diffusion through Dimensional Correlations

Imagine Flash: Accelerating Emu Diffusion Models with Backward Distillation

Continual Learning of Diffusion Models with Generative Distillation

Reducing Spatial Fitting Error in Distillation of Denoising Diffusion Models

Accelerating Diffusion Models with One-to-Many Knowledge Distillation

Physics Informed Distillation for Diffusion Models