Abstract: Generative modeling has recently undergone remarkable advancements, primarily propelled by the transformative implications of Diffusion Probabilistic Models (DPMs). The impressive capability of these models, however, often entails significant computational overhead during both training and inference. To tackle this challenge, we present Diff-Pruning, an efficient compression method tailored for learning lightweight diffusion models from pre-existing ones, without the need for extensive re-training. The essence of Diff-Pruning is encapsulated in a Taylor expansion over pruned timesteps, a process that disregards non-contributory diffusion steps and ensembles informative gradients to identify important weights. Our empirical assessment, undertaken across several datasets, highlights two primary benefits of our proposed method: 1) Efficiency: it enables approximately a 50% reduction in FLOPs at a mere 10% to 20% of the original training expenditure; 2) Consistency: the pruned diffusion models inherently preserve generative behavior congruent with their pre-trained models. Code is available at https://github.com/VainF/Diff-Pruning.

What problem does this paper attempt to address?

The main problem this paper addresses is the high computational cost of Diffusion Probabilistic Models (DPMs) during training and inference. Although these models perform excellently in generative tasks, that cost limits their use in resource-constrained environments. To tackle this challenge, the authors propose an efficient compression method called **Diff-Pruning**, which learns lightweight models from existing pre-trained diffusion models without extensive retraining.

### Main Contributions:

1. **Efficiency Improvement**: Through structured pruning, Diff-Pruning achieves approximately a 50% reduction in floating-point operations (FLOPs) at only 10% to 20% of the original model's training cost.
2. **Consistency Maintenance**: The pruned diffusion models maintain generative behavior consistent with the pre-trained models, so generation quality is not significantly affected.

### Method Overview:

- **Taylor Expansion**: The core idea is a Taylor expansion over pruned timesteps, which identifies important weights by disregarding non-contributory diffusion steps and ensembling informative gradients.
- **Importance Assessment**: The importance of each parameter is estimated through Taylor expansion, balancing the impact of image content, details, and noise.
- **Threshold Strategy**: A threshold parameter \( T \) is introduced to select important timesteps based on the relative loss \( \frac{L_t}{L_{\text{max}}} \), thereby avoiding the accumulation of noisy gradients.

### Experimental Results:

- Experiments on multiple datasets show that Diff-Pruning can significantly reduce computational costs while maintaining or even improving generation quality.
- For example, on the LSUN Church dataset, Diff-Pruning reduces FLOPs by 50% at about 10% of the original training cost (0.5 million steps compared to 4.4 million steps).
- The compressed models maintain generative behavior consistent with the pre-trained models, which supports the practicality and reliability of the method.

### Summary:

This paper introduces Diff-Pruning, an efficient diffusion model compression method that significantly reduces computational costs while maintaining generation quality. The method provides a foundation for future research on improving the quality and consistency of compressed diffusion models.
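
The threshold strategy and Taylor-based importance from the Method Overview can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy weights, losses, and gradients, and the 0.05 threshold are all assumptions made for illustration. It also scores individual scalar weights for brevity, whereas structured pruning would typically score whole groups such as channels.

```python
# Illustrative sketch only: helper names and all numbers below are hypothetical.

def select_timesteps(losses, threshold):
    """Return the timesteps t whose relative loss L_t / L_max exceeds the threshold."""
    l_max = max(losses)
    return [t for t, loss in enumerate(losses) if loss / l_max > threshold]


def taylor_importance(weights, grads_per_step, losses, threshold):
    """First-order Taylor score |w * sum_t dL_t/dw|, summed over the kept timesteps."""
    kept = select_timesteps(losses, threshold)
    scores = []
    for i, w in enumerate(weights):
        grad_sum = sum(grads_per_step[t][i] for t in kept)
        scores.append(abs(w * grad_sum))
    return scores, kept


def prune_mask(scores, ratio):
    """Keep-mask that drops the `ratio` fraction of weights with the lowest scores."""
    n_prune = int(len(scores) * ratio)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    dropped = set(order[:n_prune])
    return [i not in dropped for i in range(len(scores))]


# Toy inputs (made up): four weights, three timesteps.
weights = [0.5, -2.0, 1.0, 0.1]
losses = [0.8, 0.4, 0.02]  # per-timestep loss L_t, so L_max = 0.8
grads_per_step = [
    [0.2, 0.1, -0.3, 0.5],   # t = 0, relative loss 1.0
    [0.1, 0.0, -0.1, 0.5],   # t = 1, relative loss 0.5
    [5.0, -4.0, 3.0, -6.0],  # t = 2, relative loss 0.025: below threshold, ignored
]

scores, kept = taylor_importance(weights, grads_per_step, losses, threshold=0.05)
mask = prune_mask(scores, ratio=0.5)

print(kept)  # [0, 1]
print(mask)  # [False, True, True, False]
```

In this toy setup the third timestep is excluded before gradients are summed, so its large gradients never reach the importance scores. That is the effect the threshold is described as having on noisy gradients.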

Structural Pruning for Diffusion Models

Effortless Efficiency: Low-Cost Pruning of Diffusion Models

Pruning then Reweighting: Towards Data-Efficient Training of Diffusion Models

LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights

Structured Probabilistic Pruning for Convolutional Neural Network Acceleration

DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization

Data Pruning in Generative Diffusion Models

TinyFusion: Diffusion Transformers Learned Shallow

Structured Pruning Learns Compact and Accurate Models

LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models

Decay Pruning Method: Smooth Pruning With a Self-Rectifying Procedure

Length-Adaptive Distillation: Customizing Small Language Model for Dynamic Token Pruning

Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient

Mixture of Efficient Diffusion Experts Through Automatic Interval and Sub-Network Selection

Improving Efficiency of Diffusion Models via Multi-Stage Framework and Tailored Multi-Decoder Architectures

Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models

DepGraph: Towards Any Structural Pruning

Diffusion Probabilistic Model Made Slim

Towards Faster Training of Diffusion Models: An Inspiration of A Consistency Phenomenon

DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization

Sparse Training of Discrete Diffusion Models for Graph Generation