Abstract: Denoising diffusion models have emerged as a dominant approach for image generation; however, they still suffer from slow convergence in training and color shift issues in sampling. In this paper, we identify that these obstacles can be largely attributed to bias and suboptimality inherent in the default training paradigm of diffusion models. Specifically, we offer theoretical insights that the prevailing constant loss weight strategy in $\epsilon$-prediction of diffusion models leads to biased estimation during the training phase, hindering accurate estimations of original images. To address the issue, we propose a simple but effective weighting strategy derived from the unlocked biased part. Furthermore, we conduct a comprehensive and systematic exploration, unraveling the inherent bias problem in terms of its existence, impact and underlying reasons. These analyses contribute to advancing the understanding of diffusion models. Empirical results demonstrate that our method remarkably elevates sample quality and displays improved efficiency in both training and sampling processes, by adjusting only the loss weighting strategy. The code is released publicly at https://github.com/yuhuUSTC/Debias.
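
For context, the bias the abstract refers to can be illustrated with the standard DDPM forward process. The notation here ($\bar{\alpha}_t$ and the definition of $\text{SNR}(t)$) follows common DDPM convention and is an assumption, since the abstract does not define it:

\[ x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon, \qquad \text{SNR}(t) = \frac{\bar{\alpha}_t}{1-\bar{\alpha}_t} \]

\[ \hat{x}_0 = \frac{x_t - \sqrt{1-\bar{\alpha}_t}\,\epsilon_\theta(x_t,t)}{\sqrt{\bar{\alpha}_t}} = x_0 + \frac{1}{\sqrt{\text{SNR}(t)}}\left(\epsilon - \epsilon_\theta(x_t,t)\right) \]

Under these assumptions, a given noise-prediction error $\epsilon - \epsilon_\theta(x_t,t)$ is scaled by $1/\sqrt{\text{SNR}(t)}$ in the estimate of $x_0$, so the same error costs more at high noise levels, where $\text{SNR}(t)$ is small.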

What problem does this paper attempt to address?

This paper addresses two main problems with diffusion models: **slow convergence during training** and **color shift during sampling**. The authors attribute both largely to the bias and suboptimality inherent in the default training paradigm of diffusion models. The paper develops this in four parts:

1. **Identifying the source of the bias**:
   - The authors point out that in conventional noise prediction ($\epsilon$-prediction) with a constant loss weight, the design of the loss function leads to biased estimation during training, which hinders accurate estimation of the original image.
   - The bias shows up as the timestep $t$ increases: the estimated $\hat{x}_0$ gradually deviates from the true $x_0$, and the amplified error part gradually approaches $x_0$.

2. **Proposing an improvement**:
   - To address this, the authors propose a simple but effective weighting strategy: use the reciprocal of the square root of the signal-to-noise ratio (SNR) as the weight of the loss function:
   \[ L=\sum_{t}\mathbb{E}_{x_0,\epsilon}\left[\frac{1}{\sqrt{\text{SNR}(t)}}\|\epsilon - \epsilon_\theta(x_t,t)\|^2\right] \]
   - Adjusting the loss weight in this way reduces the error at higher noise levels more strongly, which improves sample quality and training efficiency.

3. **Systematically analyzing the bias problem**:
   - The authors analyze the bias problem from several perspectives: its existence, its impact, and its underlying reasons.
   - The analysis shows that the optimization difficulty and the importance of the denoising network vary greatly across steps $t$, especially in the initial steps, where the high noise level makes optimization harder.
   - In addition, biased estimation causes confusion and inconsistency in the first few steps of sampling, which then affects the final generated result through error propagation.

4. **Experimental verification**:
   - The experimental results show that the proposed weighting strategy significantly improves sample quality and is more efficient in both training and sampling.
   - Compared with existing weighting strategies, the new method achieves better performance with fewer training iterations and fewer sampling steps.

In summary, through theoretical analysis and experimental verification, the paper shows that the constant-weight strategy used in conventional diffusion model training leads to biased estimation, and it proposes a new weighting strategy that addresses the problem and improves both the performance and the efficiency of diffusion models.
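
The weighting in item 2 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' released implementation: the linear beta schedule, its default values, and all function names (`linear_beta_schedule`, `inv_sqrt_snr_weights`, `weighted_eps_loss`) are assumptions made for the example, and $\text{SNR}(t)$ is taken to be $\bar{\alpha}_t / (1 - \bar{\alpha}_t)$ as in common DDPM notation.

```python
import numpy as np


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Linear noise schedule (assumed here; the summary does not name one)."""
    return np.linspace(beta_start, beta_end, num_steps)


def snr(betas):
    """SNR(t) = alpha_bar_t / (1 - alpha_bar_t), with alpha_bar_t the
    cumulative product of (1 - beta)."""
    alpha_bar = np.cumprod(1.0 - betas)
    return alpha_bar / (1.0 - alpha_bar)


def inv_sqrt_snr_weights(betas):
    """Per-timestep loss weight 1 / sqrt(SNR(t))."""
    return 1.0 / np.sqrt(snr(betas))


def weighted_eps_loss(eps, eps_pred, t, weights):
    """Batch average of weights[t] * ||eps - eps_pred||^2.

    eps, eps_pred: arrays of shape (batch, ...) with true and predicted noise.
    t: integer array of shape (batch,) with the timestep of each sample.
    """
    # Squared L2 norm per sample, summed over all non-batch dimensions.
    sq_err = ((eps - eps_pred) ** 2).reshape(eps.shape[0], -1).sum(axis=1)
    return float(np.mean(weights[t] * sq_err))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = inv_sqrt_snr_weights(linear_beta_schedule())
    eps = rng.standard_normal((4, 3, 8, 8))
    # Stand-in for a network output: the true noise plus a small error.
    eps_pred = eps + 0.1 * rng.standard_normal(eps.shape)
    t = np.array([0, 250, 500, 999])
    print(weighted_eps_loss(eps, eps_pred, t, weights))
```

With this schedule the weight grows monotonically with $t$, so errors made at high noise levels contribute more to the loss than under a constant weight. In an actual training loop `eps_pred` would come from the denoising network and the loss would be computed in an autodiff framework; NumPy is used here only to keep the sketch self-contained.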

Unmasking Bias in Diffusion Model Training

Multi-Step Denoising Scheduled Sampling: Towards Alleviating Exposure Bias for Diffusion Models

InvDiff: Invariant Guidance for Bias Mitigation in Diffusion Models

Your Diffusion Model is Secretly a Noise Classifier and Benefits from Contrastive Training

RGB Images Enhancing Hyperspectral Image Denoising with Diffusion Model

Efficient Diffusion Training Via Min-SNR Weighting Strategy

Stimulating Diffusion Model for Image Denoising via Adaptive Embedding and Ensembling

A Noise-Model-Free Hyperspectral Image Denoising Method Based on Diffusion Model

Stimulating the Diffusion Model for Image Denoising Via Adaptive Embedding and Ensembling

Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy

Training Unbiased Diffusion Models From Biased Dataset

Reducing Spatial Fitting Error in Distillation of Denoising Diffusion Models

Training-free Diffusion Model Alignment with Sampling Demons

Diffusion Models With Learned Adaptive Noise

Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment

Diffusion Model for Generative Image Denoising

Erasing Undesirable Influence in Diffusion Models

Balancing Act: Distribution-Guided Debiasing in Diffusion Models

MIST: Mitigating Intersectional Bias with Disentangled Cross-Attention Editing in Text-to-Image Diffusion Models

Masked Diffusion Models Are Fast Distribution Learners

Training Diffusion Models with Reinforcement Learning