Diffusion Model-Based Image Editing: A Survey

Yi Huang, Jiancheng Huang, Yifan Liu, Mingfu Yan, Jiaxi Lv, Jianzhuang Liu, Wei Xiong, He Zhang, Shifeng Chen, Liangliang Cao

2024-03-16

Abstract: Denoising diffusion models have emerged as a powerful tool for various image generation and editing tasks, facilitating the synthesis of visual content in an unconditional or input-conditional manner. The core idea behind them is learning to reverse the process of gradually adding noise to images, allowing them to generate high-quality samples from a complex distribution. In this survey, we provide an exhaustive overview of existing methods using diffusion models for image editing, covering both theoretical and practical aspects in the field. We delve into a thorough analysis and categorization of these works from multiple perspectives, including learning strategies, user-input conditions, and the array of specific editing tasks that can be accomplished. In addition, we pay special attention to image inpainting and outpainting, and explore both earlier traditional context-driven and current multimodal conditional methods, offering a comprehensive analysis of their methodologies. To further evaluate the performance of text-guided image editing algorithms, we propose a systematic benchmark, EditEval, featuring an innovative metric, LMM Score. Finally, we address current limitations and envision some potential directions for future research. The accompanying repository is released at
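The "gradually adding noise" process mentioned in the abstract can be illustrated with a small sketch. It assumes a DDPM-style formulation with a linear beta schedule, which the abstract does not specify; the function names (`alpha_bar_schedule`, `add_noise`, `recover_x0`) and the toy pixel values are hypothetical and not taken from the paper. The true noise stands in for what a trained network would have to predict.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Closed-form forward step: x_t = sqrt(ab) * x0 + sqrt(1 - ab) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    scale_signal = math.sqrt(alpha_bar)
    scale_noise = math.sqrt(1.0 - alpha_bar)
    xt = [scale_signal * p + scale_noise * e for p, e in zip(x0, eps)]
    return xt, eps


def recover_x0(xt, eps_hat, alpha_bar):
    """Invert the forward step given a noise estimate eps_hat."""
    scale_signal = math.sqrt(alpha_bar)
    scale_noise = math.sqrt(1.0 - alpha_bar)
    return [(x - scale_noise * e) / scale_signal for x, e in zip(xt, eps_hat)]


rng = random.Random(0)
pixels = [0.1, 0.5, 0.9, -0.3]           # toy "image" of four values
alpha_bars = alpha_bar_schedule(1000)
alpha_bar_mid = alpha_bars[499]          # a mid-schedule noise level

xt, eps = add_noise(pixels, alpha_bar_mid, rng)
# Using the true noise in place of a learned prediction recovers the input.
x0_hat = recover_x0(xt, eps, alpha_bar_mid)
```

In a real diffusion model the noise estimate comes from a trained network, so the recovery is approximate and is applied iteratively over many steps rather than in one shot.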

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper sets out to provide a comprehensive review of image editing based on diffusion models. Specifically, it aims to:

1. **Provide an exhaustive overview of existing methods**: Cover diffusion model-based image editing approaches in both theory and practice.
2. **Analyze and categorize these methods in depth**: Examine them from multiple perspectives, such as learning strategies, user-input conditions, and specific editing tasks.
3. **Pay special attention to image inpainting and outpainting**: Explore both earlier traditional context-driven methods and current multimodal conditional methods, with a comprehensive analysis of their methodologies.
4. **Propose a systematic benchmark**: To evaluate text-guided image editing algorithms, the paper proposes a benchmark called EditEval, featuring an innovative metric, LMM Score.
5. **Discuss current limitations and future research directions**: Point out the shortcomings of current research and envision potential future developments.

Through these objectives, the paper aims to give researchers working on diffusion model-based image editing a systematic resource that both summarizes current research and guides future work.

High-Fidelity Diffusion-based Image Editing

A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models

DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing

Diffusion Model-Based Video Editing: A Survey

Diffusion Models in Vision: A Survey

Unsupervised Region-Based Image Editing of Denoising Diffusion Models

Stimulating Diffusion Model for Image Denoising via Adaptive Embedding and Ensembling

Differential Diffusion: Giving Each Pixel Its Strength

PAIR-Diffusion: A Comprehensive Multimodal Object-Level Image Editor

Diffusion Brush: A Latent Diffusion Model-based Editing Tool for AI-generated Images

Diffusion Models in Low-Level Vision: A Survey

Not All Steps Are Created Equal: Selective Diffusion Distillation for Image Manipulation

Diffusion Models for Image Restoration and Enhancement - A Comprehensive Survey

Guided Image Synthesis via Initial Image Editing in Diffusion Model

Diffusion Cocktail: Mixing Domain-Specific Diffusion Models for Diversified Image Generations

Conditional Image Synthesis with Diffusion Models: A Survey

eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers

Collaborative Diffusion for Multi-Modal Face Generation and Editing