Abstract: Accelerating the sampling speed of diffusion models remains a significant challenge. Recent score distillation methods distill a heavy teacher model into a student generator to achieve one-step generation; the student is optimized by computing the difference between the two score functions on samples generated by the student model. However, a score mismatch issue arises in the early stage of the distillation process, because existing methods mainly use the endpoint of pre-trained diffusion models as the teacher model and overlook the importance of the convergence trajectory between the student generator and the teacher model. To address this issue, we extend the score distillation process by introducing the entire convergence trajectory of teacher models and propose Distribution Backtracking Distillation (DisBack). DisBack is composed of two stages: Degradation Recording and Distribution Backtracking. Degradation Recording is designed to obtain the convergence trajectory of the teacher model: it records the degradation path from the trained teacher model to the untrained initial student generator. The degradation path implicitly represents the teacher model's intermediate distributions, and its reverse can be viewed as the convergence trajectory from the student generator to the teacher model. Distribution Backtracking then trains the student generator to backtrack through the intermediate distributions along this path, approximating the convergence trajectory of the teacher model. Extensive experiments show that DisBack achieves faster and better convergence than existing distillation methods and achieves comparable generation performance, with an FID score of 1.38 on the ImageNet 64x64 dataset. Notably, DisBack is easy to implement and can be generalized to existing distillation methods to boost performance.
Our code is publicly available at https://github.com/SYZhang0805/DisBack.
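
The two stages described in the abstract can be illustrated with a deliberately tiny sketch. This is not the authors' implementation: it assumes one-dimensional, unit-variance Gaussians, so that a "score model" is fully described by its mean m with score s(x) = -(x - m), and it assumes the student's own score is known in closed form rather than estimated by an auxiliary network. The function names, step counts, and learning rates are hypothetical choices made only for this illustration.

```python
import random


def degradation_recording(teacher_mean, student_mean, n_ckpt=5,
                          steps_per_ckpt=20, lr=0.05, seed=0):
    """Toy Degradation Recording (assumption: 1-D unit-variance Gaussians).

    A copy of the teacher, represented only by its mean m, is fitted to
    samples from the initial student generator. Checkpoints saved along the
    way form the degradation path from the teacher toward the student.
    """
    rng = random.Random(seed)
    m = teacher_mean
    path = [m]  # path[0] is the trained teacher itself
    for _ in range(n_ckpt):
        for _ in range(steps_per_ckpt):
            x = student_mean + rng.gauss(0.0, 1.0)  # sample from initial student
            m += lr * (x - m)  # SGD step on 0.5 * (x - m)^2
        path.append(m)
    return path


def distribution_backtracking(path, theta0, steps_per_stage=50, lr=0.1,
                              batch=64, seed=1):
    """Toy Distribution Backtracking (same assumptions as above).

    The student generator x = theta + z is trained against each recorded
    checkpoint in reverse order, ending at the teacher (path[0]).
    """
    rng = random.Random(seed)
    theta = theta0
    for target_mean in reversed(path):
        for _ in range(steps_per_stage):
            grad = 0.0
            for _ in range(batch):
                x = theta + rng.gauss(0.0, 1.0)  # student sample
                s_student = -(x - theta)          # closed-form student score
                s_target = -(x - target_mean)     # score of current checkpoint
                grad += s_student - s_target      # score difference (dx/dtheta = 1)
            theta -= lr * grad / batch
    return theta


if __name__ == "__main__":
    path = degradation_recording(teacher_mean=4.0, student_mean=0.0)
    theta = distribution_backtracking(path, theta0=0.0)
    print("degradation path:", [round(m, 2) for m in path])
    print("final student mean:", round(theta, 3))
```

In this toy setting the first backtracking target lies close to the initial student, so the score difference starts small and grows only as the targets move toward the teacher. That mirrors the motivation stated in the abstract, under the simplifying assumptions listed above.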

DDIL: Improved Diffusion Distillation With Imitation Learning

SFDDM: Single-fold Distillation for Diffusion models

Latent Dataset Distillation with Diffusion Models

Continual Learning of Diffusion Models with Generative Distillation

Relational Diffusion Distillation for Efficient Image Generation

DCD: Discriminative and Consistent Representation Distillation

Physics Informed Distillation for Diffusion Models

Knowledge Diffusion for Distillation

Distillation of Discrete Diffusion through Dimensional Correlations

EM Distillation for One-step Diffusion Models

DFGPD: a new distillation framework with global and positional distillation

Efficient Dataset Distillation via Diffusion-Driven Patch Selection for Improved Generalization

Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models

Distribution Backtracking Builds A Faster Convergence Trajectory for Diffusion Distillation

One-Step Diffusion Distillation via Deep Equilibrium Models

DCD: A New Framework for Distillation Learning With Dynamic Curriculum

CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation

DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents

Reducing Spatial Fitting Error in Distillation of Denoising Diffusion Models

Adaptive Distillation for Decentralized Learning from Heterogeneous Clients

Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation