Diffusion Tuning: Transferring Diffusion Models via Chain of Forgetting

Jincheng Zhong, Xingzhuo Guo, Jiaxiang Dong, Mingsheng Long
2024-06-06
Abstract: Diffusion models have significantly advanced the field of generative modeling. However, training a diffusion model is computationally expensive, creating a pressing need to adapt off-the-shelf diffusion models for downstream generation tasks. Current fine-tuning methods focus on parameter-efficient transfer learning but overlook the fundamental transfer characteristics of diffusion models. In this paper, we investigate the transferability of diffusion models and observe a monotonous chain of forgetting trend of transferability along the reverse process. Based on this observation and novel theoretical insights, we present Diff-Tuning, a frustratingly simple transfer approach that leverages the chain of forgetting tendency. Diff-Tuning encourages the fine-tuned model to retain the pre-trained knowledge at the end of the denoising chain close to the generated data while discarding the other noise side. We conduct comprehensive experiments to evaluate Diff-Tuning, including the transfer of pre-trained Diffusion Transformer models to eight downstream generations and the adaptation of Stable Diffusion to five control conditions with ControlNet. Diff-Tuning achieves a 26% improvement over standard fine-tuning and enhances the convergence speed of ControlNet by 24%. Notably, parameter-efficient transfer learning techniques for diffusion models can also benefit from Diff-Tuning.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses how to fine-tune diffusion models for downstream generation tasks. Training a diffusion model from scratch is computationally expensive, so adapting a pre-trained model efficiently has become a key problem. Current fine-tuning methods focus on parameter-efficient transfer learning but overlook the transfer characteristics of diffusion models themselves. The authors study the transferability of diffusion models and observe a monotonic "chain of forgetting" trend along the reverse process. Based on this observation and supporting theoretical analysis, they propose Diff-Tuning, a simple fine-tuning method that encourages the fine-tuned model to retain pre-trained knowledge at the end of the denoising chain, close to the generated data, while discarding it on the noise side. The experiments cover the transfer of pre-trained Diffusion Transformer models to eight downstream generation tasks and the adaptation of Stable Diffusion to five control conditions with ControlNet. Diff-Tuning achieves a 26% improvement over standard fine-tuning and speeds up ControlNet convergence by 24%. Existing parameter-efficient transfer learning techniques can also benefit from Diff-Tuning.
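To make the "retain near the data, discard near the noise" idea concrete, here is a minimal sketch of a timestep-dependent loss weighting. It is an illustration only: the text above does not specify the weighting schedule, so the linear schedule is an assumption, and the names `retention_weight`, `adaptation_weight`, `combined_loss`, and `num_steps` are hypothetical. The two loss arguments stand in for a loss that preserves pre-trained behaviour and a loss on downstream data.

```python
# Hypothetical sketch, not the authors' implementation.
# Convention assumed here: t = 0 is the data end of the denoising chain,
# t = num_steps is the pure-noise end.

def retention_weight(t: int, num_steps: int) -> float:
    """Weight on preserving pre-trained behaviour (assumed linear in t).

    Largest at t = 0 (close to the generated data), zero at t = num_steps.
    """
    if not 0 <= t <= num_steps:
        raise ValueError("t must lie in [0, num_steps]")
    return 1.0 - t / num_steps


def adaptation_weight(t: int, num_steps: int) -> float:
    """Complementary weight on fitting the downstream task."""
    return 1.0 - retention_weight(t, num_steps)


def combined_loss(retention_loss: float, adaptation_loss: float,
                  t: int, num_steps: int) -> float:
    """Blend the two losses according to the timestep."""
    return (retention_weight(t, num_steps) * retention_loss
            + adaptation_weight(t, num_steps) * adaptation_loss)


if __name__ == "__main__":
    for t in (0, 250, 500, 1000):
        print(t, retention_weight(t, 1000), adaptation_weight(t, 1000))
```

Under this assumed schedule, a step sampled near the data end is dominated by the retention term, and a step sampled near the noise end is dominated by the downstream term. Any monotonically decreasing retention weight would express the same idea.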