Abstract: High computational overhead is a troublesome problem for diffusion models. Recent studies have leveraged post-training quantization (PTQ) to compress diffusion models. However, most of them only focus on unconditional models, leaving the quantization of widely-used pretrained text-to-image models, e.g., Stable Diffusion, largely unexplored. In this paper, we propose a novel post-training quantization method PCR (Progressive Calibration and Relaxing) for text-to-image diffusion models, which consists of a progressive calibration strategy that considers the accumulated quantization error across timesteps, and an activation relaxing strategy that improves the performance with negligible cost. Additionally, we demonstrate the previous metrics for text-to-image diffusion model quantization are not accurate due to the distribution gap. To tackle the problem, we propose a novel QDiffBench benchmark, which utilizes data in the same domain for more accurate evaluation. Besides, QDiffBench also considers the generalization performance of the quantized model outside the calibration dataset. Extensive experiments on Stable Diffusion and Stable Diffusion XL demonstrate the superiority of our method and benchmark. Moreover, we are the first to achieve quantization for Stable Diffusion XL while maintaining the performance.

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper aims to reduce the high computational cost of text-to-image diffusion models by means of post-training quantization. Specifically, it targets three problems:

1. **High Computational Cost**: Diffusion models require multiple denoising steps to generate images, leading to high time and memory consumption. This is particularly significant for large-scale pre-trained models such as Stable Diffusion and Stable Diffusion XL.
2. **Limitations of Existing Quantization Methods**: Existing quantization methods mainly focus on unconditional diffusion models, while research on quantizing widely used pre-trained text-to-image models (such as Stable Diffusion) is relatively scarce. Additionally, existing methods overlook accumulated quantization errors and the differing sensitivity of denoising steps to image fidelity or text-image matching.
3. **Inaccuracy of Evaluation Metrics**: Current evaluation metrics (such as FID) cannot accurately assess the performance of quantized models due to the distribution gap problem.

To address these issues, the authors propose a new post-training quantization method called PCR (Progressive Calibration and Relaxing) and a comprehensive benchmark, QDiffBench, for evaluating quantized text-to-image diffusion models.

The specific contributions are as follows:

1. **Proposing the PCR Method**: It combines a progressive calibration strategy, which accounts for quantization errors accumulated across timesteps, with an activation relaxing strategy, which improves performance at almost no additional computational cost.
2. **Proposing the QDiffBench Benchmark**: It uses data in the same domain for a more accurate FID calculation and also evaluates how well the quantized model generalizes outside the calibration dataset.
3. **Extensive Experimental Validation**: A large number of experiments on foundational diffusion models such as Stable Diffusion and Stable Diffusion XL demonstrate the superiority of the proposed method and benchmark.
4. **First Quantization of Stable Diffusion XL**: Stable Diffusion XL is one of the largest diffusion models to date, with 3.5 billion parameters, and the authors report being the first to quantize it while maintaining performance.

Through these contributions, the paper provides new solutions and evaluation standards for the efficient quantization of text-to-image diffusion models.
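The progressive calibration idea summarized above can be illustrated with a small sketch. It is only one reading of the abstract, not the authors' implementation: the toy denoising step, the min-max calibration rule, and the choice of which steps get a relaxed (higher) activation bit-width are all assumptions made for illustration. The point it shows is that each timestep's activation scale is calibrated on inputs produced by the already-quantized earlier steps, so the calibration data reflects accumulated quantization error.

```python
import numpy as np

def fake_quantize(x, scale, bits=8):
    """Uniform symmetric quantization followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

def minmax_scale(x, bits=8):
    """Pick the scale so the largest magnitude in x maps to qmax."""
    qmax = 2 ** (bits - 1) - 1
    return max(float(np.abs(x).max()), 1e-8) / qmax

def toy_denoise_step(x, t):
    """Hypothetical stand-in for one denoising step of a real model."""
    return 0.9 * x + 0.1 * np.tanh(x) * (t + 1) / 10.0

def progressive_calibrate(x_T, num_steps, bits_per_step):
    """Calibrate one activation scale per timestep, in sampling order.

    The input to step t comes from the quantized steps that ran before it,
    so every scale is fitted to data that already carries earlier error.
    """
    scales = {}
    x = x_T
    for t in reversed(range(num_steps)):  # t = num_steps-1 is the noisiest step
        bits = bits_per_step[t]
        scales[t] = minmax_scale(x, bits)
        x = toy_denoise_step(fake_quantize(x, scales[t], bits), t)
    return scales, x

rng = np.random.default_rng(0)
x_T = rng.standard_normal((16, 4))  # toy "latent" batch
num_steps = 5
# Illustrative relaxing: 10-bit activations for the first two sampling steps,
# 8-bit elsewhere. Which steps to relax is an assumption, not from the paper.
bits_per_step = {t: (10 if t >= num_steps - 2 else 8) for t in range(num_steps)}
scales, x_0 = progressive_calibrate(x_T, num_steps, bits_per_step)
```

A non-progressive baseline would instead collect the inputs for every timestep from the full-precision trajectory and fit all scales at once; swapping `fake_quantize(x, scales[t], bits)` for plain `x` inside the loop gives that variant for comparison.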

Post-training Quantization for Text-to-Image Diffusion Models with Progressive Calibration and Activation Relaxing

Post-training Quantization with Progressive Calibration and Activation Relaxing for Text-to-Image Diffusion Models.

Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion

Q-Diffusion: Quantizing Diffusion Models

Towards Accurate Post-training Quantization for Diffusion Models

PQD: Post-training Quantization for Efficient Diffusion Models

PTQD: Accurate Post-Training Quantization for Diffusion Models

TCAQ-DM: Timestep-Channel Adaptive Quantization for Diffusion Models

Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization

Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers

QVD: Post-training Quantization for Video Diffusion Models

QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

QNCD: Quantization Noise Correction for Diffusion Models

DilateQuant: Accurate and Efficient Diffusion Quantization via Weight Dilation

MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization

Temporal Feature Matters: A Framework for Diffusion Model Quantization

EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models

An Analysis on Quantizing Diffusion Transformers

EDA-DM: Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models

Timestep-Aware Correction for Quantized Diffusion Models