Abstract: Diffusion models have recently exhibited remarkable abilities to synthesize striking image samples since the introduction of denoising diffusion probabilistic models (DDPMs). Their key idea is to disrupt images into noise through a fixed forward process and learn its reverse process to generate samples from noise in a denoising way. For conditional DDPMs, most existing practices relate conditions only to the reverse process and fit it to the reversal of the unconditional forward process. We find that this limits condition modeling and generation to a small time window. In this paper, we propose a novel and flexible conditional diffusion model by introducing conditions into the forward process. We utilize extra latent space to allocate an exclusive diffusion trajectory for each condition based on some shifting rules, which disperses condition modeling to all timesteps and improves the learning capacity of the model. We formulate our method, which we call **ShiftDDPMs**, and provide a unified point of view on existing related methods. Extensive qualitative and quantitative experiments on image synthesis demonstrate the feasibility and effectiveness of ShiftDDPMs.

What problem does this paper attempt to address?

The paper asks how Conditional Diffusion Models (CDMs) can use conditional information more effectively to improve image generation quality. Existing conditional diffusion models mostly introduce the condition only in the reverse process, which confines condition modeling and generation to a small time window and restricts the model's learning ability. The paper proposes a new method, ShiftDDPMs. By introducing the condition in the forward process, it keeps the diffusion trajectories for different conditions separated across all timesteps, which improves both the model's learning ability and its generation quality.

### Main Contributions

1. **Systematic introduction of a conditional forward process**: For the first time, the paper systematically explores how to design controllable, condition-dependent diffusion trajectories in diffusion models and provides a unified perspective on existing related methods.
2. **Better use of the latent space and improved learning ability**: By shifting the diffusion trajectories, ShiftDDPMs make better use of the latent space and improve the learning ability of the model.
3. **Extensive experimental verification**: The paper verifies the feasibility and effectiveness of ShiftDDPMs on various image synthesis tasks through qualitative and quantitative experiments.

### Method Overview

- **Conditional forward process**: The paper proposes a conditional forward process that shifts the diffusion trajectory by introducing conditional information at each timestep.
  The specific form is:

  \[ q(x_t \mid x_0, c) = \mathcal{N}\left(\sqrt{\bar{\alpha}_t}\,x_0 + s_t,\ (1 - \bar{\alpha}_t)\Sigma\right) \]

  where \(s_t = k_t \cdot E(c)\) is the cumulative mean shift of the diffusion trajectory at step \(t\), \(k_t\) is a shift coefficient schedule that determines the shift pattern, and \(E(c)\) is a function that maps the condition to the latent space.
- **Parameterized reverse process**: The paper also proposes a parameterized reverse process that is fitted to the reversal of this forward process. The specific form is:

  \[ p_\theta(x_{t-1} \mid x_t, c) = \mathcal{N}\left(\frac{1}{\sqrt{\alpha_t}}\left[x_t - \frac{\beta_t}{\sqrt{1 - \bar{\alpha}_t}}\,g_\theta(x_t, t)\right] - \sqrt{\alpha_t}\,\frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}\,s_t + s_{t-1},\ \frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}\,\beta_t\Sigma\right) \]

### Experimental Results

- **Effectiveness of conditional sampling**: On the MNIST dataset, the paper tests different shift patterns (Prior-Shift, Data-Normalization, Quadratic-Shift), and all models successfully generate conditional images.
- **Sample quality assessment**: On the CIFAR-10 dataset, the paper evaluates the models with Inception Score, FID, and negative log-likelihood. The results show that ShiftDDPMs outperform traditional conditional diffusion models on multiple metrics.
- **Fast sampling adaptability**: The paper also adapts ShiftDDPMs to the DDIM framework to achieve fast sampling while maintaining high-quality generation.
- **Diffusion trajectory interpolation**: By interpolating between the diffusion trajectories of different conditions to generate images with mixed features, the paper verifies that the trajectories are decoupled.
- **Image inpainting and text-to-image generation**: The paper also conducts experiments on image inpainting and text-to-image generation to further verify the effectiveness of ShiftDDPMs.

### Conclusion

The paper proposes...
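
The shifted forward and reverse processes from the Method Overview can be illustrated with a small sketch. It is not the paper's implementation, and it rests on several assumptions that do not come from the summary: \(\Sigma = I\), so everything is computed per coordinate on plain floats; a linear \(\beta_t\) schedule; the illustrative shift schedule \(k_t = 1 - \sqrt{\bar{\alpha}_t}\), chosen only because it gives \(s_0 = 0\); the usual DDPM coefficient \(\beta_t / \sqrt{1 - \bar{\alpha}_t}\) on the network output; and a network target of \(\epsilon + s_t / \sqrt{1 - \bar{\alpha}_t}\), which is the value that makes the reverse mean coincide with the exact Gaussian posterior mean. All function names are hypothetical.

```python
import math


def make_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumption), 1-indexed; index 0 stands for t = 0."""
    betas = [0.0] + [
        beta_start + (beta_end - beta_start) * i / (T - 1) for i in range(T)
    ]
    alphas = [1.0 - b for b in betas]
    alpha_bars = [1.0]
    for t in range(1, T + 1):
        alpha_bars.append(alpha_bars[-1] * alphas[t])
    return betas, alphas, alpha_bars


def shift(t, e_c, alpha_bars):
    """s_t = k_t * E(c), with the illustrative choice k_t = 1 - sqrt(alpha_bar_t)."""
    return (1.0 - math.sqrt(alpha_bars[t])) * e_c


def forward_sample(x0, t, e_c, alpha_bars, eps):
    """Draw x_t from q(x_t | x_0, c) given standard normal noise eps (Sigma = I)."""
    ab = alpha_bars[t]
    return math.sqrt(ab) * x0 + shift(t, e_c, alpha_bars) + math.sqrt(1.0 - ab) * eps


def reverse_mean(xt, t, e_c, g, betas, alphas, alpha_bars):
    """Mean of p_theta(x_{t-1} | x_t, c), with g standing in for g_theta(x_t, t)."""
    ab, ab_prev = alpha_bars[t], alpha_bars[t - 1]
    s_t, s_prev = shift(t, e_c, alpha_bars), shift(t - 1, e_c, alpha_bars)
    denoised = (xt - betas[t] / math.sqrt(1.0 - ab) * g) / math.sqrt(alphas[t])
    return denoised - math.sqrt(alphas[t]) * (1.0 - ab_prev) / (1.0 - ab) * s_t + s_prev


def posterior_mean(x0, xt, t, e_c, betas, alphas, alpha_bars):
    """Exact mean of q(x_{t-1} | x_t, x_0, c): the DDPM posterior applied to x - s."""
    ab, ab_prev = alpha_bars[t], alpha_bars[t - 1]
    s_t, s_prev = shift(t, e_c, alpha_bars), shift(t - 1, e_c, alpha_bars)
    return (
        math.sqrt(ab_prev) * betas[t] / (1.0 - ab) * x0
        + math.sqrt(alphas[t]) * (1.0 - ab_prev) / (1.0 - ab) * (xt - s_t)
        + s_prev
    )


if __name__ == "__main__":
    betas, alphas, alpha_bars = make_schedule(1000)
    x0, e_c, eps, t = 0.3, 1.5, -0.7, 500
    xt = forward_sample(x0, t, e_c, alpha_bars, eps)
    # Assumed ideal network output: the noise plus the rescaled shift.
    g = eps + shift(t, e_c, alpha_bars) / math.sqrt(1.0 - alpha_bars[t])
    print(reverse_mean(xt, t, e_c, g, betas, alphas, alpha_bars))
    print(posterior_mean(x0, xt, t, e_c, betas, alphas, alpha_bars))
```

Under these assumptions the two printed values should agree up to floating-point rounding. This shows why the extra \(s_t\) and \(s_{t-1}\) terms appear in the reverse mean: subtracting the shift turns the process back into an ordinary DDPM trajectory.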

ShiftDDPMs: Exploring Conditional Diffusion Models by Shifting Diffusion Trajectories

Multi-Step Denoising Scheduled Sampling: Towards Alleviating Exposure Bias for Diffusion Models

Entropy-Driven Sampling and Training Scheme for Conditional Diffusion Generation

Going beyond Compositions, DDPMs Can Produce Zero-Shot Interpolations

Accelerating Diffusion Models via Early Stop of the Diffusion Process

Diffusion Probabilistic Model Made Slim

Denoising Diffusion Probabilistic Models for Action-Conditioned 3D Motion Generation

Improving Denoising Diffusion Probabilistic Models via Exploiting Shared Representations

Conditional Image Synthesis with Diffusion Models: A Survey

Med-cDiff: Conditional Medical Image Generation with Diffusion Models

UDPM: Upsampling Diffusion Probabilistic Models

Modiff: Action-Conditioned 3D Motion Generation with Denoising Diffusion Probabilistic Models

Conditional Generation from Unconditional Diffusion Models using Denoiser Representations

Discrete Modeling via Boundary Conditional Diffusion Processes

Denoising Diffusion Step-aware Models

Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation

PartDiff: Image Super-resolution with Partial Diffusion Models

Advancing Pose-Guided Image Synthesis with Progressive Conditional Diffusion Models

Structured Denoising Diffusion Models in Discrete State-Spaces

Discovery and Expansion of New Domains within Diffusion Models

Efficient Denoising Diffusion Via Probabilistic Masking