Abstract: Diffusion models have become fundamental tools for modeling data distributions in machine learning and have applications in image generation, drug discovery, and audio synthesis. Despite their success, these models face challenges when generating data with extreme brightness values, as evidenced by limitations in widely used frameworks like Stable Diffusion. Offset noise has been proposed as an empirical solution to this issue, yet its theoretical basis remains insufficiently explored. In this paper, we propose a generalized diffusion model that naturally incorporates additional noise within a rigorous probabilistic framework. Our approach modifies both the forward and reverse diffusion processes, enabling inputs to be diffused into Gaussian distributions with arbitrary mean structures. We derive a loss function based on the evidence lower bound, establishing its theoretical equivalence to offset noise with certain adjustments, while broadening its applicability. Experiments on synthetic datasets demonstrate that our model effectively addresses brightness-related challenges and outperforms conventional methods in high-dimensional scenarios.

What problem does this paper attempt to address?

This paper addresses the difficulty that existing diffusion models have in generating images with extreme brightness values. Although diffusion models have succeeded in many fields, such as image generation, drug discovery, and audio synthesis, they perform poorly on images with extremely low or high brightness (for example, completely black or completely white images). The problem is particularly evident in widely used frameworks such as Stable Diffusion.

Offset noise has been proposed as an empirical remedy. Its theoretical basis has not been fully explored, however, so it is not fully compatible with the existing theory of diffusion models, which raises the concern that using offset noise departs from that theory.

To resolve this, the paper proposes a Generalized Diffusion Model. The model modifies the forward and reverse diffusion processes by introducing an additional noise term and integrating it into a rigorous probabilistic framework. As a result, the input data can be diffused into a Gaussian distribution with an arbitrary mean structure, which handles brightness-related problems more effectively and outperforms traditional methods in high-dimensional scenarios.

### Main contributions:

1. **New loss function**: The loss function derived for this model has a form similar to that of the offset noise model, with an adjustment. The difference is that the additional noise term is multiplied by a time-dependent coefficient before being added to the standard normal noise.
2. **Generalize traditional diffusion models**: The model allows the input data to be diffused into a Gaussian distribution with an arbitrary mean structure, which includes the traditional zero-mean Gaussian distribution as a special case.
3. **Theoretical compatibility**: Because the model is built on an explicit probabilistic framework, it is theoretically compatible with other diffusion model methods, in particular the v-prediction framework.
4. **Experimental evidence**: Experiments on synthetic datasets show that the model performs well on images whose brightness is uniformly distributed between pure black and pure white, and in particular that it outperforms traditional methods in high-dimensional data settings.

### Summary of mathematical formulas:

- Forward process with the additional noise term \(\xi\):
\[ q(x_t \mid x_{t-1}, \xi) = \mathcal{N}\left(x_t \mid \sqrt{1 - \beta_t}\,(x_{t-1} + \gamma_t \xi),\ \beta_t \sigma_0^2 I\right) \]
- Loss function:
\[ \ell(\theta; x_0) = \mathbb{E}_{q(\xi),\, U(t \mid 1, T),\, \mathcal{N}(\epsilon_0 \mid 0, I)} \left[ \lambda_t \left\| \sigma_0 \epsilon_0 + \phi_t \xi - \epsilon_\theta\left(\sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\,(\sigma_0 \epsilon_0 + \psi_t \xi),\ t\right) \right\|^2 \right] \]
where \(\lambda_t\) is given by formula (11), and \(\phi_t\) and \(\psi_t\) are given by formulas (21) and (22) respectively.

With these changes, the paper both addresses the weakness of existing diffusion models in generating extreme-brightness images and provides a more solid theoretical foundation, so that the method can be combined more readily with other techniques.
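The two formulas above can be read as a sampling rule and a per-sample training objective. The following NumPy sketch is illustrative only and is not the authors' implementation. The function names `forward_step` and `loss_sample` are hypothetical. The coefficients \(\lambda_t\), \(\phi_t\), and \(\psi_t\) are passed in as plain numbers because formulas (11), (21), and (22) are not reproduced here. Drawing \(\xi\) from a standard normal and using a zero-output noise predictor are assumptions made purely for the demonstration.

```python
import numpy as np


def forward_step(x_prev, xi, beta_t, gamma_t, sigma0, rng):
    """Draw x_t from N(sqrt(1 - beta_t) * (x_prev + gamma_t * xi), beta_t * sigma0^2 * I)."""
    mean = np.sqrt(1.0 - beta_t) * (x_prev + gamma_t * xi)
    std = np.sqrt(beta_t) * sigma0
    return mean + std * rng.standard_normal(x_prev.shape)


def loss_sample(eps_model, x0, xi, eps0, t, alpha_bar_t,
                sigma0, phi_t, psi_t, lambda_t):
    """Evaluate the bracketed term of the loss for one draw of (xi, t, eps0).

    phi_t, psi_t and lambda_t must be supplied by the caller; their
    definitions (formulas (21), (22) and (11)) are not shown in this summary.
    """
    # Noisy input given to the network: x0 mixed with the shifted noise.
    x_t = (np.sqrt(alpha_bar_t) * x0
           + np.sqrt(1.0 - alpha_bar_t) * (sigma0 * eps0 + psi_t * xi))
    # Regression target: standard noise plus the time-scaled extra noise.
    target = sigma0 * eps0 + phi_t * xi
    residual = target - eps_model(x_t, t)
    return lambda_t * float(np.sum(residual ** 2))


# Demonstration with placeholder values (all numbers below are assumptions).
rng = np.random.default_rng(0)
x0 = rng.standard_normal(4)
xi = rng.standard_normal(4)      # assumption: q(xi) taken as N(0, I) here
eps0 = rng.standard_normal(4)


def zero_model(x_t, t):
    """Stand-in for the network eps_theta; always predicts zero noise."""
    return np.zeros_like(x_t)


x1 = forward_step(x0, xi, beta_t=0.01, gamma_t=0.1, sigma0=1.0, rng=rng)
value = loss_sample(zero_model, x0, xi, eps0, t=10, alpha_bar_t=0.5,
                    sigma0=1.0, phi_t=0.1, psi_t=0.1, lambda_t=1.0)
print(x1.shape, value)
```

Setting `gamma_t`, `phi_t`, and `psi_t` to zero and `sigma0` to one reduces both functions to the usual zero-mean diffusion step and noise-prediction loss, which corresponds to the special case mentioned in the contributions.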

Generalized Diffusion Model with Adjusted Offset Noise

Where to Diffuse, How to Diffuse, and How to Get Back: Automated Learning for Multivariate Diffusions

On the Generalization Properties of Diffusion Models

Blackout Diffusion: Generative Diffusion Models in Discrete-State Spaces

Removing Structured Noise with Diffusion Models

GUD: Generation with Unified Diffusion

Not All Noises Are Created Equally: Diffusion Noise Selection and Optimization

Observation-Guided Diffusion Probabilistic Models

Diffusion Models With Learned Adaptive Noise

Non-Normal Diffusion Models

Edge-preserving noise for diffusion models

Mix-DDPM: Enhancing Diffusion Models Through Fitting Mixture Noise with Global Stochastic Offset

Physics-Informed Diffusion Models

On the Generalization of Diffusion Model

Blurring Diffusion Models

Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions

Fast Diffusion Model For Seismic Data Noise Attenuation

Heavy-Tailed Diffusion Models

Exploring the Optimal Choice for Generative Processes in Diffusion Models: Ordinary vs Stochastic Differential Equations

Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data