Accelerating Diffusion Models with One-to-Many Knowledge Distillation

Linfeng Zhang, Kaisheng Ma

2024-10-05

Abstract: Significant advancements in image generation have been made with diffusion models. Nevertheless, compared with previous generative models, diffusion models incur substantial computational overhead, which prevents real-time generation. Recent approaches have aimed to accelerate diffusion models by reducing the number of sampling steps through improved sampling techniques or step distillation. However, methods that reduce the computational cost of each timestep remain relatively unexplored. Observing that diffusion models exhibit varying input distributions and feature distributions at different timesteps, we introduce one-to-many knowledge distillation (O2MKD), which distills a single teacher diffusion model into multiple student diffusion models, where each student diffusion model is trained to learn the teacher's knowledge for a subset of continuous timesteps. Experiments on CIFAR10, LSUN Church, and CelebA-HQ with DDPM, and on COCO30K with Stable Diffusion, show that O2MKD can be applied to previous knowledge distillation and fast sampling methods to achieve significant acceleration. Code will be released on GitHub.
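The one-to-many idea in the abstract can be illustrated with a minimal sketch: split the timestep axis into contiguous ranges, assign one student to each range, and at sampling time route each timestep to the student that owns it. The abstract does not say how the subsets are chosen, so the equal-size split below is an assumption, and the names `split_timesteps` and `student_for_timestep` are hypothetical helpers, not part of any released code.

```python
from typing import List, Tuple


def split_timesteps(num_timesteps: int, num_students: int) -> List[Tuple[int, int]]:
    """Split [0, num_timesteps) into contiguous half-open ranges, one per student.

    Assumption: ranges are as equal in size as possible. The paper may use
    a different partition.
    """
    if not 1 <= num_students <= num_timesteps:
        raise ValueError("need 1 <= num_students <= num_timesteps")
    base, extra = divmod(num_timesteps, num_students)
    ranges = []
    start = 0
    for i in range(num_students):
        # The first `extra` students absorb one leftover timestep each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def student_for_timestep(t: int, ranges: List[Tuple[int, int]]) -> int:
    """Return the index of the student whose range contains timestep t."""
    for idx, (lo, hi) in enumerate(ranges):
        if lo <= t < hi:
            return idx
    raise ValueError(f"timestep {t} is outside every range")


ranges = split_timesteps(num_timesteps=1000, num_students=4)
# During sampling, each denoising step would call only the student for t,
# so the per-step cost is that of one small student, not the teacher.
schedule = [student_for_timestep(t, ranges) for t in (999, 600, 250, 0)]
```

Because only one student runs at each step, adding students increases the number of stored models but not the compute per sampling step.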

Computer Vision and Pattern Recognition, Artificial Intelligence

What problem does this paper attempt to address?

The paper primarily addresses the high computational overhead that keeps diffusion models from real-time image generation. Specifically:

1. **Problem Background**: Although diffusion models perform excellently in image generation, the cost of their iterative denoising process makes real-time generation impractical, which limits deployment on edge devices and in interactive applications.
2. **Limitations of Existing Methods**: Current methods for accelerating diffusion models mainly reduce the number of sampling steps, for example through improved sampling techniques or step distillation. Little research addresses the computational cost within each timestep.
3. **Proposed New Method**: The paper introduces "One-to-Many Knowledge Distillation" (O2MKD), which distills the knowledge of one teacher model into multiple student models, with each student learning the teacher's knowledge for a specific contiguous subset of timesteps. Decomposing the task into sub-tasks lowers the learning difficulty for each student, which in turn improves image generation quality.
4. **Experimental Validation**: Results on multiple datasets (CIFAR10, LSUN Church, CelebA-HQ, and COCO30K) show that O2MKD significantly accelerates diffusion models and outperforms traditional knowledge distillation methods in image fidelity. O2MKD is also compatible with other acceleration techniques, such as DDIM.

In summary, the paper uses O2MKD to address the low computational efficiency of diffusion models and achieve faster, higher-quality image generation.
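The training side of the method described above can be sketched in a toy form: each student draws timesteps only from its own range and is fitted to the frozen teacher's output on those timesteps. Everything concrete here is an assumption made for illustration: the `teacher` function is a hypothetical stand-in for a real denoiser, the students are plain linear maps, the loss is simple output-matching mean squared error, and the ranges are equal-sized. The actual architectures and distillation losses used in the paper may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K, DIM = 1000, 4, 8  # timesteps, number of students, toy data dimension

# Contiguous range boundaries (assumed equal-sized): [0, 250, 500, 750, 1000]
bounds = np.linspace(0, T, K + 1).astype(int)


def teacher(x: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Hypothetical frozen teacher whose behaviour changes with the timestep."""
    return (1.0 - t / T)[:, None] * x


# One toy linear student per timestep range: prediction = x @ students[k]
students = [np.zeros((DIM, DIM)) for _ in range(K)]


def distill_step(k: int, batch: int = 64, lr: float = 0.5) -> float:
    """One gradient step fitting student k to the teacher on its own range."""
    t = rng.integers(bounds[k], bounds[k + 1], size=batch)  # upper bound exclusive
    x = rng.standard_normal((batch, DIM))
    err = x @ students[k] - teacher(x, t)
    loss = float(np.mean(err ** 2))
    # Gradient of the mean squared error with respect to the weight matrix.
    students[k] -= lr * (2.0 * x.T @ err / err.size)
    return loss


# Train every student independently on its own subset of timesteps.
histories = [[distill_step(k) for _ in range(300)] for k in range(K)]
```

In this toy setup each student only has to match the teacher over a quarter of the timestep axis, which is a narrower target than the full range a single student would face in ordinary distillation.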

Relational Diffusion Distillation for Efficient Image Generation

Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling

DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture

Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation

Knowledge Diffusion for Distillation

Multi-student Diffusion Distillation for Better One-step Generators

One-step Diffusion with Distribution Matching Distillation

Data-Free Adversarial Distillation

Up to 100x Faster Data-Free Knowledge Distillation

SFDDM: Single-fold Distillation for Diffusion models

A Comprehensive Survey on Knowledge Distillation of Diffusion Models

Accelerated Image-Aware Generative Diffusion Modeling

Diffusion Models Are Innate One-Step Generators

Simple and Fast Distillation of Diffusion Models

Knowledge Distillation with Feature Maps for Image Classification

Collaborative Knowledge Distillation Via Multiknowledge Transfer

A Novel Framework for Online Knowledge Distillation

Latent Dataset Distillation with Diffusion Models

Small Scale Data-Free Knowledge Distillation

Is Synthetic Data From Diffusion Models Ready for Knowledge Distillation?