Structural Pruning for Diffusion Models

Gongfan Fang, Xinyin Ma, Xinchao Wang
2023-09-30
Abstract: Generative modeling has recently undergone remarkable advancements, primarily propelled by the transformative implications of Diffusion Probabilistic Models (DPMs). The impressive capability of these models, however, often entails significant computational overhead during both training and inference. To tackle this challenge, we present Diff-Pruning, an efficient compression method tailored for learning lightweight diffusion models from pre-existing ones, without the need for extensive re-training. The essence of Diff-Pruning is encapsulated in a Taylor expansion over pruned timesteps, a process that disregards non-contributory diffusion steps and ensembles informative gradients to identify important weights. Our empirical assessment, undertaken across several datasets, highlights two primary benefits of our proposed method: 1) Efficiency: it enables approximately a 50% reduction in FLOPs at a mere 10% to 20% of the original training expenditure; 2) Consistency: the pruned diffusion models inherently preserve generative behavior congruent with their pre-trained models. Code is available at https://github.com/VainF/Diff-Pruning.
Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The main problem this paper addresses is the high computational cost of Diffusion Probabilistic Models (DPMs) during training and inference. Although these models perform very well on generative tasks, that cost limits their use in resource-constrained environments. To tackle this challenge, the authors propose **Diff-Pruning**, an efficient compression method that learns lightweight models from existing pre-trained diffusion models without extensive retraining.

### Main Contributions:

1. **Efficiency Improvement**: Through structural pruning, Diff-Pruning reduces the number of floating-point operations (FLOPs) by approximately 50% at only 10% to 20% of the original model's training cost.
2. **Consistency Maintenance**: The pruned diffusion models preserve generative behavior consistent with the pre-trained models, so generation quality is not significantly affected.

### Method Overview:

- **Taylor Expansion**: The core idea is a Taylor expansion over pruned timesteps: non-contributory diffusion steps are disregarded, and the informative gradients from the remaining steps are ensembled to identify important weights.
- **Importance Assessment**: The importance of each parameter is estimated through this Taylor expansion, balancing the influence of image content, details, and noise.
- **Threshold Strategy**: A threshold parameter \( T \) selects the important timesteps based on the relative loss \( \frac{L_t}{L_{\text{max}}} \), which avoids accumulating noisy gradients.

### Experimental Results:

- Experiments on multiple datasets show that Diff-Pruning can significantly reduce computational cost while maintaining or even improving generation quality.
- For example, on the LSUN Church dataset, Diff-Pruning reduces FLOPs by 50% at roughly 10% of the original model's training cost: 0.5 million steps compared to 4.4 million steps.
- The compressed models maintain generative behavior consistent with the pre-trained models, which supports the practicality and reliability of the method.

### Summary:

This paper introduces Diff-Pruning, an efficient diffusion model compression method that significantly reduces computational cost while maintaining generation quality. The method provides a foundation for future research on improving the quality and consistency of compressed diffusion models.
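
The threshold strategy and the Taylor-based importance score described in the method overview can be illustrated with a small sketch. This is not the authors' implementation: the function names `timestep_mask` and `taylor_importance` are hypothetical, the toy losses and gradients are made up, and the code assumes that a timestep is dropped when its relative loss \( \frac{L_t}{L_{\text{max}}} \) falls below the threshold and that a weight's score is the absolute value of the weight times the summed gradients of the kept timesteps.

```python
from typing import List, Sequence


def timestep_mask(losses: Sequence[float], threshold: float) -> List[int]:
    """Return 1 for each kept timestep and 0 for each dropped one.

    Assumption: a timestep is dropped when its relative loss
    L_t / L_max is below `threshold`.
    """
    l_max = max(losses)
    if l_max <= 0:
        raise ValueError("expected at least one positive loss")
    return [1 if loss / l_max >= threshold else 0 for loss in losses]


def taylor_importance(
    weights: Sequence[float],
    grads_per_step: Sequence[Sequence[float]],
    mask: Sequence[int],
) -> List[float]:
    """First-order Taylor score per weight: |w_i * sum_t mask_t * g_{t,i}|.

    Gradients from dropped timesteps (mask 0) do not contribute.
    """
    scores = []
    for i, w in enumerate(weights):
        grad_sum = sum(m * g[i] for m, g in zip(mask, grads_per_step))
        scores.append(abs(w * grad_sum))
    return scores


# Toy data (hypothetical): four timesteps, three weights.
losses = [1.0, 0.5, 0.04, 0.01]
weights = [0.5, -2.0, 1.0]
grads_per_step = [
    [0.2, 0.1, -0.3],
    [0.1, -0.1, 0.1],
    [5.0, 5.0, 5.0],   # low-loss step: large but uninformative gradient
    [-4.0, 3.0, 2.0],  # low-loss step: large but uninformative gradient
]

mask = timestep_mask(losses, threshold=0.05)
scores = taylor_importance(weights, grads_per_step, mask)
print(mask)    # [1, 1, 0, 0]
print(scores)  # the weight with the smallest score is the first pruning candidate
```

In this toy case the last two timesteps are masked out, so their large gradients cannot dominate the ranking. A structural pruner would additionally aggregate such per-weight scores over each channel or layer group before deciding what to remove; that step is omitted here.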