Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models

Weijian Luo, Tianyang Hu, Shifeng Zhang, Jiacheng Sun, Zhenguo Li, Zhihua Zhang
2024-01-15
Abstract: Due to the ease of training, ability to scale, and high sample quality, diffusion models (DMs) have become the preferred option for generative modeling, with numerous pre-trained models available for a wide variety of datasets. Containing intricate information about data distributions, pre-trained DMs are valuable assets for downstream applications. In this work, we consider learning from pre-trained DMs and transferring their knowledge to other generative models in a data-free fashion. Specifically, we propose a general framework called Diff-Instruct to instruct the training of arbitrary generative models as long as the generated samples are differentiable with respect to the model parameters. Our proposed Diff-Instruct is built on a rigorous mathematical foundation where the instruction process directly corresponds to minimizing a novel divergence we call Integral Kullback-Leibler (IKL) divergence. IKL is tailored for DMs by calculating the integral of the KL divergence along a diffusion process, which we show to be more robust in comparing distributions with misaligned supports. We also reveal non-trivial connections of our method to existing works such as DreamFusion and generative adversarial training. To demonstrate the effectiveness and universality of Diff-Instruct, we consider two scenarios: distilling pre-trained diffusion models and refining existing GAN models. The experiments on distilling pre-trained diffusion models show that Diff-Instruct results in state-of-the-art single-step diffusion-based models. The experiments on refining GAN models show that Diff-Instruct can consistently improve the pre-trained generators of GAN models across various settings.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to transfer knowledge from pre-trained diffusion models (DMs) to other generative models, especially implicit generative models such as GANs, without access to the original training data. Specifically, the authors propose a general-purpose framework, Diff-Instruct, that can guide the training of any generative model whose generated samples are differentiable with respect to the model parameters. The method rests on a rigorous mathematical foundation: it minimizes a new divergence, the Integral Kullback-Leibler (IKL) divergence, which is tailored to DMs by integrating the KL divergence along a diffusion process and is more robust when comparing distributions with misaligned supports.

### Core Problems of the Paper

1. **Knowledge Transfer**: Can the knowledge in a pre-trained diffusion model be transferred to another generative model, instead of that model learning directly from the original training data?
2. **Technical Challenges**: Implicit models lack explicit score information, so how can they effectively receive supervision from the multi-level score networks of a diffusion model?

### Solutions

- **Diff-Instruct Framework**: A general-purpose framework that uses a pre-trained diffusion model to guide the training of an implicit generative model.
- **IKL Divergence**: The Integral Kullback-Leibler (IKL) divergence compares two distributions by integrating the KL divergence along a diffusion process. This avoids the degeneracy of the standard KL divergence when the two distributions have misaligned supports.
- **Algorithm Implementation**: Knowledge is transferred from the pre-trained diffusion model to the target generative model by alternating between two updates: one for the generator parameters and one for a score function that models the generator's marginal distributions along the diffusion.
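
The alternating updates described above can be illustrated with a one-dimensional toy. This is a minimal sketch under strong assumptions, not the authors' implementation: the "teacher" is a Gaussian whose diffused score is known in closed form (standing in for a pre-trained score network), the generator is `x_0 = theta + z`, and the student score has a single parameter `phi`. The noise schedule `sigma(t) = t`, the constant weighting, and all names (`MU_TEACHER`, `diff_instruct_toy`, and so on) are hypothetical choices made for illustration. The generator step uses the difference between the student score and the teacher score at noised generator samples, which is the gradient of the KL divergence between the diffused marginals with respect to `theta`.

```python
import random

MU_TEACHER = 2.0  # hypothetical mean of the teacher's data distribution N(MU_TEACHER, 1)


def sigma(t):
    """Assumed variance-exploding noise schedule: x_t = x_0 + sigma(t) * eps."""
    return t


def teacher_score(x_t, t):
    """Closed-form score of the diffused teacher marginal N(MU_TEACHER, 1 + sigma^2).

    Stands in for a pre-trained diffusion model's score network.
    """
    return -(x_t - MU_TEACHER) / (1.0 + sigma(t) ** 2)


def student_score(x_t, t, phi):
    """One-parameter model of the score of the generator's diffused marginal."""
    return -(x_t - phi) / (1.0 + sigma(t) ** 2)


def sample_batch(theta, batch, rng):
    """Draw (t, eps, x_t) triples, where x_0 = theta + z is a generator sample."""
    out = []
    for _ in range(batch):
        t = rng.uniform(0.05, 1.0)
        x_0 = theta + rng.gauss(0.0, 1.0)
        eps = rng.gauss(0.0, 1.0)
        out.append((t, eps, x_0 + sigma(t) * eps))
    return out


def diff_instruct_toy(steps=2000, batch=64, lr_theta=0.05, lr_phi=0.2, seed=0):
    rng = random.Random(seed)
    theta, phi = -3.0, 0.0  # generator parameter and student-score parameter
    for _ in range(steps):
        # Phase 1: fit the student score to the current generator with
        # denoising score matching, loss = sigma^2 * (s_phi(x_t) + eps/sigma)^2.
        grad_phi = 0.0
        for t, eps, x_t in sample_batch(theta, batch, rng):
            s = sigma(t)
            residual = student_score(x_t, t, phi) + eps / s
            grad_phi += 2.0 * s ** 2 * residual / (1.0 + s ** 2)
        phi -= lr_phi * grad_phi / batch

        # Phase 2: update the generator with the score difference.
        # Here d x_t / d theta = 1 and the time weighting is constant.
        grad_theta = 0.0
        for t, eps, x_t in sample_batch(theta, batch, rng):
            grad_theta += student_score(x_t, t, phi) - teacher_score(x_t, t)
        theta -= lr_theta * grad_theta / batch
    return theta, phi
```

Calling `diff_instruct_toy()` returns the final `(theta, phi)`; in this toy both should end near `MU_TEACHER`, although no training data from the teacher distribution is ever sampled. In a realistic setting both scores would be neural networks and the generator derivative would come from automatic differentiation.
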
### Experimental Verification

- **Single-Step Diffusion Distillation**: On the CIFAR-10 and ImageNet 64×64 datasets, a single-step generator is trained with the Diff-Instruct framework. The resulting models reach state-of-the-art FID among single-step diffusion-based models.
- **Improving GAN Generators**: Pre-trained StyleGAN-2 generators are refined with the Diff-Instruct framework. The experiments show consistent improvements in generative performance.

### Main Contributions

- **Generality**: The Diff-Instruct framework applies both to distilling diffusion models into single-step generators and to refining existing GAN generators.
- **Data-Free**: The knowledge transfer requires no real data and relies entirely on the pre-trained diffusion model.
- **Theoretical Basis**: The IKL divergence is proposed and shown to be more robust than the KL divergence when comparing distributions with misaligned supports.

### Conclusion

With the Diff-Instruct framework, this paper addresses the technical problem of transferring knowledge from pre-trained diffusion models to other generative models, in particular implicit generative models. The experimental results support the effectiveness and generality of the method and suggest a new direction for research on generative models.