Edge-preserving noise for diffusion models

Jente Vandersanden, Sascha Holl, Xingchang Huang, Gurprit Singh
2024-10-25
Abstract: Classical generative diffusion models learn an isotropic Gaussian denoising process, treating all spatial regions uniformly, thus neglecting potentially valuable structural information in the data. Inspired by the long-established work on anisotropic diffusion in image processing, we present a novel edge-preserving diffusion model that is a generalization of denoising diffusion probabilistic models (DDPM). In particular, we introduce an edge-aware noise scheduler that varies between edge-preserving and isotropic Gaussian noise. We show that our model's generative process converges faster to results that more closely match the target distribution. We demonstrate its capability to better learn the low-to-mid frequencies within the dataset, which plays a crucial role in representing shapes and structural information. Our edge-preserving diffusion process consistently outperforms state-of-the-art baselines in unconditional image generation. It is also more robust for generative tasks guided by a shape-based prior, such as stroke-to-image generation. We present qualitative and quantitative results showing consistent improvements (FID score) of up to 30% for both tasks. We provide source code and supplementary content at http://edge-preserving-diffusion.mpi-inf.mpg.de.
Computer Vision and Pattern Recognition, Artificial Intelligence, Graphics, Machine Learning
What problem does this paper attempt to address?
This paper addresses the problem that traditional generative diffusion models ignore structural information when processing data. Specifically:

1. **Limitations of traditional diffusion models**:
   - Classic generative diffusion models use an isotropic Gaussian denoising process, treating all spatial regions equally and ignoring potentially valuable structural information in the data.
   - In the reverse process, the model learns an isotropic denoising process, which ignores the non-isotropic structural content in data samples.
2. **Introducing edge-preserving noise**:
   - Inspired by anisotropic diffusion in the field of image processing, the authors propose a new edge-preserving diffusion model, which is a generalization of the Denoising Diffusion Probabilistic Model (DDPM).
   - This model introduces an edge-aware noise scheduler, which can vary between edge-preserving noise and isotropic Gaussian noise.
3. **Improving generation quality and efficiency**:
   - The authors show that their model converges more quickly to results closer to the target distribution during the generation process.
   - The model can better learn the low-to-mid-frequency components in the dataset, which is crucial for representing shape and structural information.
   - In the unconditional image generation task, the edge-preserving diffusion process outperforms the existing state-of-the-art baseline models.
   - For generation tasks guided by shape priors (such as stroke-to-image generation), the model also shows higher robustness and better quality.

### Summary

By introducing edge-preserving noise, this paper addresses the problem that traditional diffusion models ignore structural information when processing data, thereby improving the quality and efficiency of generated images and showing significant advantages in multiple tasks.
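
### Illustrative sketch

To make the idea of noise that "varies between edge-preserving and isotropic Gaussian noise" more concrete, the following is a minimal sketch and not the authors' formulation. It assumes a Perona-Malik style edge-stopping function from classical anisotropic diffusion and a simple linear blend controlled by a hypothetical parameter `tau`. The function names, the parameter `lam`, and the blending rule are all assumptions made for illustration; only the isotropic case (`tau = 1`) corresponds to the standard DDPM forward step.

```python
import numpy as np


def edge_stopping(image, lam):
    """Perona-Malik style conductance: close to 1 in flat regions, close to 0 on strong edges."""
    gy, gx = np.gradient(image)  # finite differences along rows, then columns
    grad_mag = np.hypot(gx, gy)
    return 1.0 / (1.0 + (grad_mag / lam) ** 2)


def edge_aware_forward(x0, alpha_bar, tau, lam=0.1, rng=None):
    """Sample a noisy image from x0 with a per-pixel noise scale (hypothetical variant).

    alpha_bar: cumulative signal retention in (0, 1], as in DDPM.
    tau: blend in [0, 1]; 0 gives fully edge-preserving noise, 1 gives isotropic noise.
    lam: assumed edge sensitivity; smaller values suppress noise on weaker edges.
    """
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal(x0.shape)
    # Assumed blending rule: interpolate between edge-aware and uniform noise scaling.
    scale = (1.0 - tau) * edge_stopping(x0, lam) + tau
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * scale * eps
    return x_t, scale


# Toy image: a vertical step edge between a dark half and a bright half.
x0 = np.zeros((8, 8))
x0[:, 4:] = 1.0

_, scale_edge = edge_aware_forward(x0, alpha_bar=0.5, tau=0.0, rng=np.random.default_rng(0))
_, scale_iso = edge_aware_forward(x0, alpha_bar=0.5, tau=1.0, rng=np.random.default_rng(0))
```

On this toy image, `scale_edge` equals 1 in the flat regions and drops to about 0.04 in the two columns straddling the step, so less noise is injected there and the edge survives longer. `scale_iso` is 1 everywhere, which recovers the uniform noise of standard DDPM. A time-dependent schedule for `tau` would move between these two regimes over the course of the diffusion process.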