Accelerated Image-Aware Generative Diffusion Modeling

Tanmay Asthana, Yufang Bao, Hamid Krim
2024-08-16
Abstract: We propose in this paper an analytically new construct of a diffusion model whose drift and diffusion parameters yield an exponentially time-decaying Signal to Noise Ratio in the forward process. In reverse, the construct cleverly carries out the learning of the diffusion coefficients on the structure of clean images using an autoencoder. The proposed methodology significantly accelerates the diffusion process, reducing the required diffusion time steps from around 1000 seen in conventional models to 200-500 without compromising image quality in the reverse-time diffusion. In a departure from conventional models which typically use time-consuming multiple runs, we introduce a parallel data-driven model to generate a reverse-time diffusion trajectory in a single run of the model. The resulting collective block-sequential generative model eliminates the need for MCMC-based sub-sampling correction for safeguarding and improving image quality, to further improve the acceleration of image generation. Collectively, these advancements yield a generative model that is an order of magnitude faster than conventional approaches, while maintaining high fidelity and diversity in generated images, hence promising widespread applicability in rapid image synthesis tasks.
Image and Video Processing
What problem does this paper attempt to address?
The main problems this paper attempts to solve are the slow convergence and high computational cost of image generation in traditional generative diffusion models. Specifically:

1. **Problems of traditional models**:
   - **Slow convergence**: Traditional generative diffusion models require a large number of time steps (usually around 1,000) to generate high-quality images, which leads to high computational complexity.
   - **Low computational efficiency**: To ensure image quality, traditional models usually need to be run multiple times, which further increases the computational burden.
2. **Proposed new methods**:
   - **Accelerating the diffusion process**: The paper proposes a new diffusion model whose drift and diffusion parameters are chosen so that the signal-to-noise ratio (SNR) of the forward diffusion process decays exponentially over time. This reduces the required diffusion time steps from around 1,000 to 200-500 while maintaining image quality.
   - **Autoencoder learning**: In the reverse diffusion process, an autoencoder learns the diffusion coefficients from the structure of clean images, so that reverse diffusion trajectories are generated more effectively. This improves generation speed while maintaining the fidelity and diversity of the images.
   - **Single-run generation**: Unlike traditional models that need to be run multiple times, the new method generates a complete reverse diffusion trajectory in a single run using a parallel data-driven model. This eliminates the need for MCMC-based sub-sampling correction and further accelerates image generation.
3. **Main contributions**:
   - **Accelerated generation**: The new model generates images an order of magnitude faster than traditional methods while maintaining image quality and diversity.
   - **Optimized scheduling strategy**: A pixel-level scheduling strategy optimizes the forward diffusion process so that all pixels reach the isotropic Gaussian distribution more quickly.
   - **Application of the autoencoder**: The autoencoder learns the global image structure and generates efficient reverse diffusion trajectories, further improving generation efficiency.

In conclusion, by introducing a new diffusion model and an optimized scheduling strategy, the paper significantly improves the convergence speed and computational efficiency of generative diffusion models, providing a new solution for fast image synthesis tasks.
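
To make the idea of an exponentially decaying SNR concrete, here is a minimal Python sketch. It is not the authors' construct: it assumes a generic variance-preserving forward process, x_t = sqrt(a_t) x_0 + sqrt(1 - a_t) eps, for which SNR(t) = a_t / (1 - a_t). The function names, the endpoint SNR values (1e4 and 1e-4), and the choice of 300 steps (inside the 200-500 range quoted above) are illustrative assumptions. The paper's actual drift and diffusion coefficients and its pixel-level schedule are not given in this summary.

```python
import numpy as np


def exp_snr_schedule(num_steps, snr_start=1e4, snr_end=1e-4):
    """Hypothetical schedule: SNR decays exponentially from snr_start to snr_end.

    Interpolating linearly in log-SNR gives a constant ratio between
    consecutive steps, i.e. SNR(t) = snr_start * exp(-k * t).
    """
    t = np.linspace(0.0, 1.0, num_steps)
    log_snr = (1.0 - t) * np.log(snr_start) + t * np.log(snr_end)
    return np.exp(log_snr)


def alpha_bar_from_snr(snr):
    """Invert SNR = a / (1 - a) for a variance-preserving process (assumed)."""
    return snr / (1.0 + snr)


def forward_sample(x0, alpha_bar_t, rng):
    """Draw x_t = sqrt(a_t) * x0 + sqrt(1 - a_t) * eps in closed form."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps


rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))   # stand-in for a clean image
snr = exp_snr_schedule(num_steps=300)       # assumed step count
alpha_bar = alpha_bar_from_snr(snr)
x_T = forward_sample(x0, alpha_bar[-1], rng)  # nearly pure Gaussian noise
```

Defining the schedule in log-SNR keeps the decay rate explicit. Under the same assumptions, a per-pixel variant would replace the scalar endpoints with arrays of the image's shape.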