Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation

Axel Sauer, Frederic Boesel, Tim Dockhorn, Andreas Blattmann, Patrick Esser, Robin Rombach
2024-03-19
Abstract: Diffusion models are the main driver of progress in image and video synthesis, but suffer from slow inference speed. Distillation methods, like the recently introduced adversarial diffusion distillation (ADD), aim to shift the model from many-shot to single-step inference, albeit at the cost of expensive and difficult optimization due to its reliance on a fixed pretrained DINOv2 discriminator. We introduce Latent Adversarial Diffusion Distillation (LADD), a novel distillation approach overcoming the limitations of ADD. In contrast to pixel-based ADD, LADD utilizes generative features from pretrained latent diffusion models. This approach simplifies training and enhances performance, enabling high-resolution multi-aspect ratio image synthesis. We apply LADD to Stable Diffusion 3 (8B) to obtain SD3-Turbo, a fast model that matches the performance of state-of-the-art text-to-image generators using only four unguided sampling steps. Moreover, we systematically investigate its scaling behavior and demonstrate LADD's effectiveness in various applications such as image editing and inpainting.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the slow inference of diffusion models, particularly for high-resolution image synthesis. Standard diffusion sampling requires many network evaluations per image. Distillation methods such as adversarial diffusion distillation (ADD) reduce the number of steps, but ADD is expensive and difficult to optimize because it relies on a fixed pretrained DINOv2 discriminator and operates on pixels. The paper proposes Latent Adversarial Diffusion Distillation (LADD), which instead uses generative features from a pretrained latent diffusion model, simplifying training and improving performance. Applied to Stable Diffusion 3 (8B), LADD yields SD3-Turbo, which matches the performance of state-of-the-art text-to-image generators using only four unguided sampling steps. The method also applies to image editing and inpainting.
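To make the cost argument concrete, the following minimal sketch counts denoiser forward passes per image. It assumes that guided sampling uses classifier-free guidance with two passes per step (one conditional, one unconditional) and that unguided sampling uses one pass per step. The 50-step guided baseline is a hypothetical figure chosen for illustration and is not taken from the paper; the function name `network_evaluations` is likewise illustrative.

```python
def network_evaluations(steps: int, guided: bool) -> int:
    """Count denoiser forward passes needed to sample one image.

    Assumption: classifier-free guidance costs two passes per step
    (conditional + unconditional); unguided sampling costs one.
    """
    passes_per_step = 2 if guided else 1
    return steps * passes_per_step


# Hypothetical baseline: 50 guided steps (illustrative, not from the paper).
baseline = network_evaluations(steps=50, guided=True)

# Distilled setting described in the abstract: four unguided steps.
distilled = network_evaluations(steps=4, guided=False)

print(f"baseline passes:  {baseline}")
print(f"distilled passes: {distilled}")
print(f"reduction factor: {baseline / distilled:.1f}x")
```

Under these assumptions, the reduction comes from two sources: fewer sampling steps and dropping the second, unconditional pass that guidance requires at each step.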