Abstract: Denoising diffusion probabilistic models for image inpainting add noise to the image texture during the forward process and recover the masked regions from the unmasked texture via the reverse denoising process. Although they generate meaningful semantics, existing methods suffer from a semantic discrepancy between masked and unmasked regions: the semantically dense unmasked texture is not completely degraded, while the masked regions turn to pure noise during the diffusion process, leaving a large gap between the two. In this paper, we aim to answer how unmasked semantics guide the texture denoising process, and how to tackle the semantic discrepancy, so as to facilitate consistent and meaningful semantics generation. To this end, we propose a novel structure-guided diffusion model, named StrDiffusion, which reformulates the conventional texture denoising process under structure guidance to derive a simplified denoising objective for image inpainting, while revealing: 1) the semantically sparse structure helps tackle the semantic discrepancy in the early stage, while the dense texture generates reasonable semantics in the late stage; 2) the semantics from unmasked regions essentially offer time-dependent structure guidance for the texture denoising process, benefiting from the time-dependent sparsity of the structure semantics. For the denoising process, a structure-guided neural network is trained to estimate the simplified denoising objective by exploiting the consistency of the denoised structure between masked and unmasked regions. Besides, we devise an adaptive resampling strategy as a formal criterion of whether the structure is competent to guide the texture denoising process, while regulating their semantic correlations. Extensive experiments validate the merits of StrDiffusion over the state of the art. Our code is available at
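
The semantic discrepancy described in the abstract can be illustrated with a minimal sketch. It is not the StrDiffusion method: it only uses the standard DDPM closed-form forward step, x_t = sqrt(ᾱ_t) x_0 + sqrt(1 - ᾱ_t) ε, under an assumed linear noise schedule. The function names (`make_alpha_bar`, `forward_noise`, `masked_forward`), the schedule values, and the convention that `mask == 1` marks missing pixels are all assumptions made for illustration.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_noise(x0, t, alpha_bar, rng):
    """Standard DDPM closed form: sample x_t directly from x_0."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

def masked_forward(x0, mask, t, alpha_bar, rng):
    """Hypothetical inpainting input at step t (mask == 1 marks missing pixels).

    Known pixels follow the forward process; missing pixels carry no
    information about x0, so they are modeled here as pure Gaussian noise.
    """
    noisy_known = forward_noise(x0, t, alpha_bar, rng)
    pure_noise = rng.standard_normal(x0.shape)
    return mask * pure_noise + (1.0 - mask) * noisy_known

def region_correlation(a, b, region):
    """Pearson correlation between a and b restricted to a boolean region."""
    return float(np.corrcoef(a[region], b[region])[0, 1])

rng = np.random.default_rng(0)
x0 = rng.standard_normal((64, 64))   # stand-in for a unit-variance texture
mask = np.zeros((64, 64))
mask[:, 32:] = 1.0                   # right half is missing
alpha_bar = make_alpha_bar()

xt = masked_forward(x0, mask, 250, alpha_bar, rng)
corr_known = region_correlation(xt, x0, mask == 0)
corr_missing = region_correlation(xt, x0, mask == 1)
```

At an intermediate step, `corr_known` stays well above zero because the unmasked texture is only partially degraded, while `corr_missing` is close to zero. That gap is one simple way to view the discrepancy the abstract refers to.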

A Diffusion Model with A FFT for Image Inpainting

Rethinking Fast Fourier Convolution in Image Inpainting

Diffusion Models with Anisotropic Gaussian Splatting for Image Inpainting

FFTI: Image Inpainting Algorithm via Features Fusion and Two-Steps Inpainting

TD-Paint: Faster Diffusion Inpainting Through Time Aware Pixel Conditioning

Video Diffusion Models are Strong Video Inpainter

A Double Feature Fusion Network with Progressive Learning for Sharper Inpainting

Towards Coherent Image Inpainting Using Denoising Diffusion Implicit Models

Structure Matters: Tackling the Semantic Discrepancy in Diffusion Models for Image Inpainting

Image inpainting algorithm based on double curvature-driven diffusion model with P-Laplace operator

Reconstruct-and-Generate Diffusion Model for Detail-Preserving Image Denoising

Anti-forensics of Diffusion-Based Image Inpainting

Flow-Guided Diffusion for Video Inpainting

Diffusion-Based Image Inpainting Forensics Via Gradient Domain Guided Filtering Enhancement

Minutes to Seconds: Speeded-up DDPM-based Image Inpainting with Coarse-to-Fine Sampling

High-Fidelity Diffusion-based Image Editing

3D-Consistent Image Inpainting with Diffusion Models

MMGInpainting: Multi-Modality Guided Image Inpainting Based On Diffusion Models

Efficient Parallel Data Optimization for Homogeneous Diffusion Inpainting of 4K Images

Uni-paint: A Unified Framework for Multimodal Image Inpainting with Pretrained Diffusion Model

FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion