Diffusion Model Patching via Mixture-of-Prompts

Seokil Ham, Sangmin Woo, Jin-Young Kim, Hyojun Go, Byeongjun Park, Changick Kim
2024-05-30
Abstract: We present Diffusion Model Patching (DMP), a simple method to boost the performance of pre-trained diffusion models that have already reached convergence, with a negligible increase in parameters. DMP inserts a small, learnable set of prompts into the model's input space while keeping the original model frozen. The effectiveness of DMP is not merely due to the addition of parameters but stems from its dynamic gating mechanism, which selects and combines a subset of learnable prompts at every step of the generative process (e.g., reverse denoising steps). This strategy, which we term "mixture-of-prompts", enables the model to draw on the distinct expertise of each prompt, essentially "patching" the model's functionality at every step with minimal yet specialized parameters. Uniquely, DMP enhances the model by further training on the same dataset on which it was originally trained, even in a scenario where significant improvements are typically not expected due to model convergence. Experiments show that DMP significantly enhances the converged FID of DiT-L/2 on FFHQ 256x256 by 10.38%, achieved with only a 1.43% parameter increase and 50K additional training iterations.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses how to further improve pre-trained diffusion models that have already converged. It proposes Diffusion Model Patching (DMP), which inserts a small set of learnable prompts into the model's input space and uses a dynamic gating mechanism to select and combine a subset of them at every step of the generative process. The original model stays frozen, so only the prompts and the gate are trained, and the parameter count grows negligibly. Because the gate chooses different prompts at different denoising steps, each prompt can specialize, a strategy the authors call "mixture-of-prompts". Notably, the additional training uses the same dataset the model was originally trained on, a setting where significant gains are not normally expected after convergence. For example, on FFHQ 256×256, DMP improves the converged FID of DiT-L/2 by 10.38% with only a 1.43% increase in parameters and 50K additional training iterations.
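The gating idea described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes the gate is a single linear map from a sinusoidal timestep embedding to one score per prompt, that the top-k prompts are mixed with softmax weights, and that the mixed prompt is added element-wise to a hidden input vector. All names (`timestep_embedding`, `top_k_softmax`, `mixture_of_prompts`, `patch_input`) and all sizes are hypothetical, and the values are random rather than trained.

```python
import math
import random


def timestep_embedding(t, dim):
    """Sinusoidal embedding of a scalar timestep (a common choice, assumed here)."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]


def gate_logits(t_emb, gate_weights):
    """Assumed gate: one linear score per prompt, computed from the timestep embedding."""
    return [sum(w * e for w, e in zip(row, t_emb)) for row in gate_weights]


def top_k_softmax(logits, k):
    """Keep the k highest-scoring prompts and normalize their scores with a softmax."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - m) for i in top}
    z = sum(exps.values())
    return {i: v / z for i, v in exps.items()}


def mixture_of_prompts(t, prompts, gate_weights, k):
    """Return the weighted sum of the selected prompts and the selection weights."""
    t_emb = timestep_embedding(t, len(gate_weights[0]))
    weights = top_k_softmax(gate_logits(t_emb, gate_weights), k)
    mixed = [0.0] * len(prompts[0])
    for i, w in weights.items():
        for d, value in enumerate(prompts[i]):
            mixed[d] += w * value
    return mixed, weights


def patch_input(hidden, mixed_prompt):
    """Add the mixed prompt to the input; the frozen model itself is untouched."""
    return [h + p for h, p in zip(hidden, mixed_prompt)]


# Hypothetical sizes and random (untrained) values, for illustration only.
rng = random.Random(0)
NUM_PROMPTS, PROMPT_DIM, EMB_DIM, TOP_K = 8, 4, 16, 2
prompts = [[rng.gauss(0, 0.02) for _ in range(PROMPT_DIM)] for _ in range(NUM_PROMPTS)]
gate_weights = [[rng.gauss(0, 1.0) for _ in range(EMB_DIM)] for _ in range(NUM_PROMPTS)]

hidden = [1.0, 2.0, 3.0, 4.0]
for t in (999, 500, 1):
    mixed, weights = mixture_of_prompts(t, prompts, gate_weights, TOP_K)
    patched = patch_input(hidden, mixed)
    print(t, sorted(weights), [round(x, 4) for x in patched])
```

Because the gate depends only on the timestep in this sketch, the set of selected prompts can change from one denoising step to the next, which is the property the summary attributes to the method. In a real setup the prompts and gate weights would be the only trained parameters.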