Lecture Notes in Probabilistic Diffusion Models

Inga Strümke, Helge Langseth
2023-12-16
Abstract: Diffusion models are loosely modelled on non-equilibrium thermodynamics, where \textit{diffusion} refers to particles flowing from high-concentration regions towards low-concentration regions. In statistics, the meaning is quite similar, namely the process of transforming a complex distribution $p_{\text{complex}}$ on $\mathbb{R}^d$ to a simple distribution $p_{\text{prior}}$ on the same domain. This constitutes a Markov chain of diffusion steps that slowly add random noise to data, followed by a reverse diffusion process in which the data is reconstructed from the noise. By training on a large number of data points, the diffusion model learns the data manifold to which the original, and thus the reconstructed, data samples belong. While the diffusion process pushes a data sample off the data manifold, the reverse process finds a trajectory back to the data manifold. Diffusion models have -- unlike variational autoencoders and flow models -- latent variables with the same dimensionality as the original data, and they are currently\footnote{At the time of writing, 2023.} outperforming other approaches -- including Generative Adversarial Networks (GANs) -- to modelling the distribution of, e.g., natural images.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper primarily explores the fundamental mathematical principles of probabilistic diffusion models and their application in generative modeling. Specifically:

1. **Description of the Diffusion Process**:
   - The paper gives a detailed introduction to the forward diffusion process, which gradually transforms a complex data distribution into a simple prior distribution by adding noise.
   - Noise is added to the data samples over a series of steps, until they follow a simple, tractable Gaussian distribution.
2. **Reverse Diffusion Process**:
   - The paper examines how the reverse process gradually removes noise, recovering data samples from the simple distribution.
   - The reverse process is also modelled with Gaussian transitions, whose parameters are learned by training a neural network.
3. **Generative Modeling**:
   - The reverse diffusion process is used for generative modeling: starting from noise, it removes noise step by step to obtain samples that follow the original data distribution.
   - Unlike Variational Autoencoders (VAEs) and flow models, diffusion models have latent variables with the same dimensionality as the original data. At the time of writing (2023), they outperform other approaches, including GANs, at modelling the distribution of, e.g., natural images.
4. **Loss Function**:
   - The paper discusses how to define the loss function used to train the neural network.
   - Since computing the log-likelihood directly is infeasible, the paper uses a variational lower bound as a tractable surrogate for maximum-likelihood estimation.

Through the above, the paper aims to provide a fundamental mathematical framework for diffusion models and to highlight their strong performance in generative modeling.
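The forward diffusion process described above can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's own code, and it rests on assumptions not stated in this summary: a Gaussian transition of the form $x_t = \sqrt{1-\beta_t}\,x_{t-1} + \sqrt{\beta_t}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$, a linear schedule for $\beta_t$ with illustrative endpoint values, and hypothetical function names (`linear_beta_schedule`, `forward_step`, `forward_jump`).

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear noise schedule; the endpoint values are assumptions."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t = prod_{s <= t} (1 - beta_s) for every step t."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_step(x_prev, beta, rng):
    """One Markov step: x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * eps."""
    keep = math.sqrt(1.0 - beta)
    noise = math.sqrt(beta)
    return [keep * v + noise * rng.gauss(0.0, 1.0) for v in x_prev]


def forward_jump(x0, alpha_bar, rng):
    """Sample x_t from x_0 in one shot, using the product of the Gaussian steps."""
    keep = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [keep * v + noise * rng.gauss(0.0, 1.0) for v in x0]


rng = random.Random(0)            # fixed seed so the run is repeatable
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

x0 = [3.0, -2.0, 0.5]             # toy "data sample" in R^3

# Step-by-step chain: noise is added slowly, one small step at a time.
x = list(x0)
for beta in betas:
    x = forward_step(x, beta, rng)
x_T_chain = x

# One-shot sample of the same final-step distribution.
x_T_jump = forward_jump(x0, alpha_bars[-1], rng)

# Fraction of the original signal that survives after all steps.
signal_left = math.sqrt(alpha_bars[-1])
print(signal_left, x_T_chain, x_T_jump)
```

Under these assumptions `signal_left` is close to zero after the final step, so the end of the chain is nearly indistinguishable from a draw from the standard Gaussian prior. A learned reverse process would then have to undo these steps one at a time; that part requires a trained neural network and is not sketched here.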