Abstract: Denoising Diffusion Probabilistic Models (DDPMs) exhibit remarkable capabilities in image generation, with studies suggesting that they can generalize by composing latent factors learned from the training data. In this work, we go further and study DDPMs trained on strictly separate subsets of the data distribution with large gaps on the support of the latent factors. We show that such a model can effectively generate images in the unexplored, intermediate regions of the distribution. For instance, when trained on clearly smiling and non-smiling faces, we demonstrate a sampling procedure which can generate slightly smiling faces without reference images (zero-shot interpolation). We replicate these findings for other attributes as well as other datasets. Our code is available at https://github.com/jdeschena/ddpm-zero-shot-interpolation.

What problem does this paper attempt to address?

The paper explores the capabilities of Denoising Diffusion Probabilistic Models (DDPMs) in image generation, particularly their ability to perform zero-shot interpolation in intermediate regions that the training data does not cover. Specifically, the paper addresses the following key issues:

1. **Research Background and Motivation**:
   - Existing research indicates that DDPMs can generate new images by combining latent factors learned from training data, a phenomenon known as "compositionality."
   - The authors further investigate whether DDPMs can perform interpolation, that is, generate images at intermediate values of latent factors that were not present in the training data.

2. **Methodology**:
   - The authors define a special data generation model where the training data only includes extreme examples (e.g., faces with very big smiles or no smiles at all), excluding examples of intermediate states.
   - A sampling method called "multi-guidance" is used, which leverages the scores of multiple classifiers to guide the generation process toward images of intermediate states.
   - A filtering process is proposed to extract extreme examples from real datasets, ensuring that the training dataset meets the requirements for interpolation experiments.

3. **Main Contributions**:
   - Demonstrated that DDPMs can effectively generate images with intermediate attributes even when trained only on extreme examples, a phenomenon referred to as zero-shot interpolation.
   - Validated this finding on real-world datasets (e.g., CelebA) and synthetic datasets.
   - Explored the impact of different training settings, hyperparameter choices, and model architectures on interpolation performance.
   - Showed that DDPMs maintain interpolation capabilities even with smaller amounts of data.

4. **Empirical Results**:
   - On the CelebA dataset, for the attribute "smile," the trained DDPMs could generate images with smile intensities between a big smile and no smile, despite the absence of such intermediate samples in the training data.
   - Interpolation performance decreases as the amount of training data is reduced, but DDPMs still exhibit some interpolation capability even with smaller datasets.
   - The multi-guidance method is relatively robust to changes in the guidance parameter λ.

In summary, this paper demonstrates that DDPMs go beyond simple compositional capabilities in image generation and can interpolate into regions that lie between the training subsets. This has significant implications for addressing fairness and bias mitigation issues in machine learning.
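The summary describes multi-guidance only in words, so the following is a minimal one-dimensional sketch of the general idea, not the authors' implementation. It assumes the common classifier-guidance form, in which the gradient of each classifier's log-probability, scaled by λ, is added to the model score. The two-mode Gaussian "data" distribution, the logistic classifiers, the Langevin sampler, and all function names and parameter values are illustrative assumptions.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mixture_score(x: float, mu: float = 2.0, sigma: float = 0.5) -> float:
    """Score d/dx log p(x) of an equal-weight mixture of N(-mu, sigma^2) and N(+mu, sigma^2).

    Toy stand-in for a model trained only on two "extreme" subsets.
    """
    # Posterior weight of the +mu component at x.
    r_pos = sigmoid(2.0 * mu * x / sigma**2)
    return (r_pos * (mu - x) + (1.0 - r_pos) * (-mu - x)) / sigma**2


def classifier_log_grad(x: float, label: int, sharpness: float = 1.0) -> float:
    """d/dx log p(label | x) for a toy logistic classifier with p(1 | x) = sigmoid(sharpness * x)."""
    p1 = sigmoid(sharpness * x)
    return sharpness * (1.0 - p1) if label == 1 else -sharpness * p1


def multi_guided_score(x: float, lam: float) -> float:
    """Model score plus lam times the summed log-probability gradients of both classifiers (assumed form)."""
    guidance = classifier_log_grad(x, 1) + classifier_log_grad(x, 0)
    return mixture_score(x) + lam * guidance


def langevin_sample(lam: float, rng: random.Random, steps: int = 500, step_size: float = 0.01) -> float:
    """Draw one sample with unadjusted Langevin dynamics driven by the guided score."""
    x = rng.gauss(0.0, 1.0)
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        x += step_size * multi_guided_score(x, lam) + math.sqrt(2.0 * step_size) * noise
    return x


def mean_abs(lam: float, n: int = 200, seed: int = 0) -> float:
    """Average distance of n samples from the midpoint x = 0 between the two modes."""
    rng = random.Random(seed)
    return sum(abs(langevin_sample(lam, rng)) for _ in range(n)) / n


if __name__ == "__main__":
    print("unguided mean |x|:", mean_abs(0.0))
    print("guided   mean |x|:", mean_abs(4.0))
```

In this toy setting the two classifier gradients cancel at the midpoint and oppose whichever extreme a sample is near, so guided samples sit closer to the region between the two training modes than unguided ones. It illustrates only the mechanics of combining several classifier scores; it says nothing about image quality or the paper's reported results.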

Going beyond Compositions, DDPMs Can Produce Zero-Shot Interpolations

Multi-Step Denoising Scheduled Sampling: Towards Alleviating Exposure Bias for Diffusion Models

UDPM: Upsampling Diffusion Probabilistic Models

ShiftDDPMs: Exploring Conditional Diffusion Models by Shifting Diffusion Trajectories

Few-shot Image Generation with Diffusion Models

Denoising diffusion probabilistic models are optimally adaptive to unknown low dimensionality

Improving Denoising Diffusion Probabilistic Models via Exploiting Shared Representations

Discovery and Expansion of New Domains within Diffusion Models

An Edit Friendly DDPM Noise Space: Inversion and Manipulations

Progressive Image Synthesis from Semantics to Details with Denoising Diffusion GAN

Structured Denoising Diffusion Models in Discrete State-Spaces

A Survey of Data-Driven 2D Diffusion Models for Generating Images from Text

Think While You Generate: Discrete Diffusion with Planned Denoising

Pseudo Numerical Methods for Diffusion Models on Manifolds

Denoising Diffusion Probabilistic Models in Six Simple Steps

Accelerating Diffusion Models via Early Stop of the Diffusion Process

Denoising Diffusion Step-aware Models

Enhancing Diffusion Models for High-Quality Image Generation

Mix-DDPM: Enhancing Diffusion Models Through Fitting Mixture Noise with Global Stochastic Offset

SatDM: Synthesizing Realistic Satellite Image with Semantic Layout Conditioning using Diffusion Models