An Edit Friendly DDPM Noise Space: Inversion and Manipulations

Inbar Huberman-Spiegelglas, Vladimir Kulikov, Tomer Michaeli

2024-04-10

Abstract: Denoising diffusion probabilistic models (DDPMs) employ a sequence of white Gaussian noise samples to generate an image. In analogy with GANs, those noise maps could be considered as the latent code associated with the generated image. However, this native noise space does not possess a convenient structure, and is thus challenging to work with in editing tasks. Here, we propose an alternative latent noise space for DDPM that enables a wide range of editing operations via simple means, and present an inversion method for extracting these edit-friendly noise maps for any given image (real or synthetically generated). As opposed to the native DDPM noise space, the edit-friendly noise maps do not have a standard normal distribution and are not statistically independent across timesteps. However, they allow perfect reconstruction of any desired image, and simple transformations on them translate into meaningful manipulations of the output image (e.g. shifting, color edits). Moreover, in text-conditional models, fixing those noise maps while changing the text prompt, modifies semantics while retaining structure. We illustrate how this property enables text-based editing of real images via the diverse DDPM sampling scheme (in contrast to the popular non-diverse DDIM inversion). We also show how it can be used within existing diffusion-based editing methods to improve their quality and diversity. Webpage:

Machine Learning

What problem does this paper attempt to address?

This paper addresses a limitation of image editing with Denoising Diffusion Probabilistic Models (DDPMs). The native DDPM noise space is poorly suited to editing tasks because its noise maps lack a convenient structure. The paper proposes an alternative latent noise space that supports diverse editing of real images without fine-tuning the model or modifying attention maps, and that can be easily integrated into other algorithms. The main contributions are:

1. An inversion method that extracts a sequence of DDPM noise maps from any given image (real or synthetic). These noise maps reconstruct the image perfectly and are edit-friendly.
2. The edit-friendly noise maps do not follow a standard normal distribution and are not statistically independent across timesteps, but simple transformations of them produce meaningful changes in the output image, such as shifting and color edits.
3. In a text-conditional model, fixing these noise maps while changing the text prompt changes the semantics of the image while preserving its structure. This enables text-guided editing of real images with the diverse DDPM sampling scheme.
4. The method can be combined with existing diffusion-based editing methods to improve their quality and diversity.

The paper's experiments on text-guided editing report that the approach preserves the structure of the input image, edits faster, and produces more diverse results. Compared with DDIM-based inversion methods, it also achieves better structural fidelity.
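The perfect-reconstruction property can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes the standard DDPM update x_{t-1} = mu_t(x_t) + sigma_t * z_t with sigma_t^2 = beta_t, builds each noisy x_t from the image using independently drawn noise, and then solves the update for z_t. The names `make_schedule`, `toy_eps_model`, `posterior_mean`, `invert`, and `generate` are hypothetical, and `toy_eps_model` is only a stand-in for a trained noise predictor.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=2e-2):
    """Linear beta schedule (an assumed, commonly used choice)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_eps_model(x_t, t):
    """Hypothetical stand-in for a trained noise predictor eps_theta(x_t, t)."""
    return 0.1 * np.tanh(x_t)


def posterior_mean(x_t, t, schedule, eps_model):
    """Standard DDPM mean of x_{t-1} given x_t (t is a 0-based step index)."""
    betas, alphas, alpha_bars = schedule
    eps = eps_model(x_t, t)
    return (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])


def invert(x0, schedule, eps_model, rng):
    """Extract noise maps z_T, ..., z_1 that regenerate x0 exactly."""
    betas, _, alpha_bars = schedule
    num_steps = len(betas)
    # xs[0] is the image; xs[t + 1] is a noisy version drawn with its own noise.
    xs = [x0]
    for t in range(num_steps):
        noise = rng.standard_normal(x0.shape)
        xs.append(np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise)
    # Solve x_{t-1} = mu_t(x_t) + sigma_t * z_t for z_t, from the last step down.
    zs = []
    for t in reversed(range(num_steps)):
        mu = posterior_mean(xs[t + 1], t, schedule, eps_model)
        zs.append((xs[t] - mu) / np.sqrt(betas[t]))
    return xs[-1], zs


def generate(x_T, zs, schedule, eps_model):
    """Run the DDPM sampler with fixed noise maps instead of fresh noise."""
    betas = schedule[0]
    x = x_T
    for z, t in zip(zs, reversed(range(len(betas)))):
        x = posterior_mean(x, t, schedule, eps_model) + np.sqrt(betas[t]) * z
    return x


rng = np.random.default_rng(0)
schedule = make_schedule()
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))  # toy "image"
x_T, zs = invert(x0, schedule, toy_eps_model, rng)
reconstruction = generate(x_T, zs, schedule, toy_eps_model)
print("max reconstruction error:", np.abs(reconstruction - x0).max())
```

Because every z_t is defined by solving the sampler's own update equation, regeneration with the same predictor returns the input up to floating-point error, whatever the predictor is. All maps are derived from the same x0, which is consistent with the summary's point that they are not independent across timesteps. In a text-conditional setting, the editing step would reuse `x_T` and `zs` while calling the predictor with a different prompt; that part is not modeled here.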

There and Back Again: On the relation between noises, images, and their inversions in diffusion models

Multi-Step Denoising Scheduled Sampling: Towards Alleviating Exposure Bias for Diffusion Models

Structured Denoising Diffusion Models in Discrete State-Spaces

Edge-preserving noise for diffusion models

Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation

Think While You Generate: Discrete Diffusion with Planned Denoising

UDPM: Upsampling Diffusion Probabilistic Models

Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing

Noise Map Guidance: Inversion with Spatial Context for Real Image Editing

Enabling Local Editing in Diffusion Models by Joint and Individual Component Analysis

Diffusion Model-Based Image Editing: A Survey

Listening to the Noise: Blind Denoising with Gibbs Diffusion

Boundary Guided Learning-Free Semantic Control with Diffusion Models

TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models

Going beyond Compositions, DDPMs Can Produce Zero-Shot Interpolations

EDICT: Exact Diffusion Inversion via Coupled Transformations

Source Prompt Disentangled Inversion for Boosting Image Editability with Diffusion Models

Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models