Continual Learning of Diffusion Models with Generative Distillation

Sergi Masip, Pau Rodriguez, Tinne Tuytelaars, Gido M. van de Ven

2024-05-21

Abstract: Diffusion models are powerful generative models that achieve state-of-the-art performance in image synthesis. However, training them demands substantial amounts of data and computational resources. Continual learning would allow for incrementally learning new tasks and accumulating knowledge, thus enabling the reuse of trained models for further learning. One potentially suitable continual learning approach is generative replay, where a copy of a generative model trained on previous tasks produces synthetic data that are interleaved with data from the current task. However, standard generative replay applied to diffusion models results in a catastrophic loss in denoising capabilities. In this paper, we propose generative distillation, an approach that distils the entire reverse process of a diffusion model. We demonstrate that our approach substantially improves the continual learning performance of generative replay with only a modest increase in the computational costs.

Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses continual learning in diffusion models. Diffusion models perform well at image generation, but they are expensive to train and tend to forget previously learned knowledge when trained on new tasks. In standard generative replay, a copy of the model trained on previous tasks generates samples from those tasks, and these samples are interleaved with the data of the new task. Applied to diffusion models, however, this approach severely compromises the model's denoising ability. The paper notes that DDIM samplers can speed up the generation of replay samples, but that image quality then degrades quickly. To address this, the paper introduces knowledge distillation into generative replay and calls the result "generative distillation". The method distils the entire reverse process of the teacher model (the copy trained on previous tasks) into the student model (the model being trained on the new task), so that knowledge is transferred throughout the reverse process and not only at its endpoint, the generated samples. Experiments show that this substantially improves continual learning performance, mitigates forgetting, and preserves denoising ability, with only a modest increase in computational cost over generative replay. The paper also studies how the number of DDIM steps affects the results and compares generative distillation with other forms of knowledge distillation. Overall, the paper aims to make diffusion models more efficient to train by enabling them to learn continually and accumulate knowledge under limited resources.
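The difference between generative replay and generative distillation can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the toy linear "denoisers", the noise schedule, and the names `make_denoiser`, `add_noise`, `replay_loss`, and `distillation_loss` are all hypothetical, and the way noisy inputs are formed from replay samples is an assumption made for illustration. The point it tries to show is only the change of training target: replay regresses the student onto the sampled noise, whereas distillation regresses it onto the teacher's own noise prediction at the same noisy input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed linear noise schedule (illustrative values, not taken from the paper).
T = 100
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)


def make_denoiser(weight):
    """Return a toy stand-in for a noise-prediction network eps(x_t, t)."""
    def eps_model(x_t, t):
        return weight * x_t * np.sqrt(1.0 - alpha_bar[t])
    return eps_model


def add_noise(x0, t, eps):
    """Forward diffusion: noise a clean sample x0 to timestep t."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps


def replay_loss(student, x0_replay, t, eps):
    """Generative replay: the student predicts the noise that was added."""
    x_t = add_noise(x0_replay, t, eps)
    return np.mean((student(x_t, t) - eps) ** 2)


def distillation_loss(student, teacher, x0_replay, t, eps):
    """Distillation: the student matches the teacher's noise prediction."""
    x_t = add_noise(x0_replay, t, eps)
    return np.mean((student(x_t, t) - teacher(x_t, t)) ** 2)


teacher = make_denoiser(0.5)   # frozen copy trained on previous tasks
student = make_denoiser(0.1)   # model being trained on the new task

# Stand-in for samples the teacher would generate (e.g. with a DDIM sampler).
x0_replay = rng.standard_normal((8, 4))
eps = rng.standard_normal(x0_replay.shape)
t = 50

print("replay loss:      ", replay_loss(student, x0_replay, t, eps))
print("distillation loss:", distillation_loss(student, teacher, x0_replay, t, eps))
```

In this sketch the distillation loss is zero exactly when the student reproduces the teacher's prediction, regardless of how good the replayed samples are, which is one way to read the summary's claim that knowledge is transferred throughout the reverse process and not only through the final samples.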

DDGR: Continual Learning with Deep Diffusion-based Generative Replay

Class-Incremental Learning using Diffusion Model for Distillation and Replay

Joint Diffusion models in Continual Learning

Distilling Diffusion Models into Conditional GANs

Latent Dataset Distillation with Diffusion Models

Continual Learning with Diffusion-based Generative Replay for Industrial Streaming Data

One-Step Diffusion Distillation via Deep Equilibrium Models

DDIL: Improved Diffusion Distillation With Imitation Learning

EM Distillation for One-step Diffusion Models

Where to Diffuse, How to Diffuse, and How to Get Back: Automated Learning for Multivariate Diffusions

Plug-and-Play Diffusion Distillation

Accelerating Video Diffusion Models via Distribution Matching

Transfer Learning for Diffusion Models

Using Diffusion Models as Generative Replay in Continual Federated Learning -- What will Happen?

Generative Dataset Distillation Based on Diffusion Model

Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling

Distillation of Discrete Diffusion through Dimensional Correlations

Continual Offline Reinforcement Learning via Diffusion-based Dual Generative Replay

Training Diffusion Models with Reinforcement Learning

CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation