Non-Denoising Forward-Time Diffusions

Stefano Peluchetti
2023-12-22
Abstract: The scope of this paper is generative modeling through diffusion processes. An approach falling within this paradigm is the work of Song et al. (2021), which relies on a time-reversal argument to construct a diffusion process targeting the desired data distribution. We show that the time-reversal argument, common to all denoising diffusion probabilistic modeling proposals, is not necessary. We obtain diffusion processes targeting the desired data distribution by taking appropriate mixtures of diffusion bridges. The resulting transport is exact by construction, allows for greater flexibility in choosing the dynamics of the underlying diffusion, and can be approximated by means of a neural network via novel training objectives. We develop a unifying view of the drift adjustments corresponding to our and to time-reversal approaches and make use of this representation to inspect the inner workings of diffusion-based generative models. Finally, we leverage scalable simulation and inference techniques common in spatial statistics to move beyond fully factorial distributions in the underlying diffusion dynamics. The methodological advances contained in this work contribute toward establishing a general framework for generative modeling based on diffusion processes.
Machine Learning
What problem does this paper attempt to address?
This paper studies generative modeling through diffusion processes, and in particular an alternative to denoising diffusion probabilistic modeling (DDPM). DDPM relies on a time-reversal argument to construct a diffusion process that transports a simple, data-independent distribution to the target data distribution. The paper argues that the time-reversal argument is not necessary: a diffusion process targeting the data distribution can be constructed directly from diffusion bridges. The resulting transport is exact by construction and allows greater flexibility in choosing the underlying diffusion dynamics. It can also be approximated by a neural network via new training objectives.
The proposed method, Diffusion Bridge Mixture Transport (DBMT), does not rely on time reversal. Instead, it transports a data-independent distribution to the target data distribution by mixing diffusion bridges over their starting and ending points. According to the paper, DBMT can be applied to almost any initial distribution, drift, and diffusion coefficient, and it is more flexible than DDPM because the underlying diffusion process does not need to converge to a simple distribution. The paper also discusses how to implement DBMT and the time-reversal approach (DTRT) for a class of stochastic differential equations (SDEs), and gives a unified representation of the drift adjustments of the two methods, which helps in understanding the inner workings of diffusion-based generative models. In addition, using scalable simulation and inference techniques from spatial statistics, the paper extends the class of SDEs that can be used in practice in computer vision applications to include more realistic diffusion transitions.
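To make the bridge-mixture idea concrete, the following is a minimal one-dimensional sketch, not the paper's implementation. It assumes the simplest possible setting: the underlying diffusion is a Brownian motion with constant scale `sigma`, every path starts at 0, and the target is a hypothetical two-point distribution (`ATOMS`, `PROBS`). In this case the Brownian bridge from 0 to an endpoint `a` has drift `(a - x) / (1 - t)`, and the mixture drift is the average of these bridge drifts under the posterior of the endpoint given the current state. All names and parameter values are illustrative assumptions.

```python
import numpy as np

# Hypothetical two-point target distribution (an assumption for illustration).
ATOMS = np.array([-2.0, 2.0])
PROBS = np.array([0.3, 0.7])


def mixture_drift(x, t, atoms, probs, sigma):
    """Drift of the mixture of Brownian bridges from 0 to each atom."""
    # Posterior log-weights of each endpoint a given X_t = x, using the
    # bridge marginal N(t * a, sigma^2 * t * (1 - t)); terms that do not
    # depend on a are dropped, which also removes the singularity at t = 0.
    scale = sigma**2 * (1.0 - t)
    logw = (np.log(probs)
            + x[:, None] * atoms / scale
            - t * atoms**2 / (2.0 * scale))
    logw -= logw.max(axis=1, keepdims=True)  # stabilise the softmax
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)
    expected_end = w @ atoms  # E[X_1 | X_t = x]
    # Posterior average of the bridge drifts (a - x) / (1 - t).
    return (expected_end - x) / (1.0 - t)


def sample(n_paths=20000, n_steps=200, sigma=1.0, seed=0):
    """Euler-Maruyama simulation of the mixture diffusion on [0, 1]."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    x = np.zeros(n_paths)  # all paths start at x0 = 0
    for k in range(n_steps):
        t = k * dt  # the last step uses t = 1 - dt, so 1 - t stays positive
        drift = mixture_drift(x, t, ATOMS, PROBS, sigma)
        x = x + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return x


if __name__ == "__main__":
    x = sample()
    print("fraction of samples near +2:", (x > 0).mean())
```

Under these assumptions the simulated endpoints should cluster around the two atoms, with the fraction of positive samples close to the target weight 0.7, up to Monte Carlo and discretisation error. No time reversal is involved: the process runs forward in time from the initial point to the target.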
Finally, the paper proposes training objectives for neural networks that approximate the drift adjustment of DBMT, and demonstrates DBMT on image generation tasks, reporting good performance as measured by the distance between the distribution of generated samples and the target data distribution.
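The summary does not state the form of these training objectives. One natural candidate, assumed here purely for illustration, is least-squares regression of a model onto the drift of the sampled bridge; the minimiser of such a loss is the conditional expectation of the bridge drift, which is the mixture drift. The sketch below checks this in a setting where everything is available in closed form: Brownian bridges from 0 to a Gaussian endpoint, with a linear model standing in for the neural network. The function names and parameter values are hypothetical.

```python
import numpy as np


def fit_linear_drift(t=0.5, mu=1.0, sd=0.5, sigma=1.0, n=200_000, seed=0):
    """Least-squares fit of the bridge drift at a fixed time t."""
    rng = np.random.default_rng(seed)
    # Endpoints drawn from a Gaussian target N(mu, sd^2).
    x1 = mu + sd * rng.standard_normal(n)
    # Marginal of the Brownian bridge from 0 to x1 at time t.
    xt = t * x1 + sigma * np.sqrt(t * (1.0 - t)) * rng.standard_normal(n)
    # Regression target: drift of the sampled bridge at (xt, t).
    y = (x1 - xt) / (1.0 - t)
    slope, intercept = np.polyfit(xt, y, 1)  # highest degree first
    return slope, intercept


def exact_linear_drift(t=0.5, mu=1.0, sd=0.5, sigma=1.0):
    """Closed-form mixture drift (E[X_1 | X_t = x] - x) / (1 - t)."""
    # Gaussian conjugacy: posterior precision and mean of X_1 given X_t = x.
    prec = 1.0 / sd**2 + t / (sigma**2 * (1.0 - t))
    a = (1.0 / (sigma**2 * (1.0 - t))) / prec  # coefficient of x
    b = (mu / sd**2) / prec                    # constant term
    return (a - 1.0) / (1.0 - t), b / (1.0 - t)


if __name__ == "__main__":
    print("fitted:", fit_linear_drift())
    print("exact: ", exact_linear_drift())
```

With the default values the fitted slope and intercept should agree with the closed-form ones up to Monte Carlo error. In a realistic setting the linear model would be replaced by a neural network taking both the state and the time as inputs, and the time would be sampled rather than fixed.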