Abstract: Low-bit quantization has become widespread for compressing image super-resolution (SR) models for edge deployment, which allows advanced SR models to enjoy compact low-bit parameters and efficient integer/bitwise constructions for storage compression and inference acceleration, respectively. However, low-bit quantization is known to degrade the accuracy of SR models compared to their full-precision (FP) counterparts. Despite several efforts to alleviate the degradation, transformer-based SR models still suffer severe degradation due to their distinctive activation distributions. In this work, we present a dual-stage low-bit post-training quantization (PTQ) method for image super-resolution, namely 2DQuant, which achieves efficient and accurate SR under low-bit quantization. The proposed method first investigates the weights and activations and finds that their distributions are characterized by coexisting symmetry and asymmetry as well as long tails. Specifically, we propose Distribution-Oriented Bound Initialization (DOBI), which uses different search strategies to find a coarse bound for the quantizers. To obtain refined quantizer parameters, we further propose Distillation Quantization Calibration (DQC), which employs a distillation approach to make the quantized model learn from its FP counterpart. Through extensive experiments on different bit widths and scaling factors, DOBI alone can reach state-of-the-art (SOTA) performance, while after stage two our method surpasses existing PTQ methods in both metrics and visual effects. 2DQuant gains an increase in PSNR as high as 4.52dB on Set5 (x2) compared with SOTA when quantized to 2-bit and enjoys a 3.60x compression ratio and 5.08x speedup ratio. The code and models will be available at https://github.com/Kai-Liu001/2DQuant.
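To make the idea of searching a coarse clipping bound more concrete, the following is a minimal sketch of an MSE-driven grid search over quantizer bounds. It is not the authors' DOBI implementation: the function names (`fake_quantize`, `mse`, `search_bounds`), the candidate grid, and the two strategies (shrinking a symmetric bound, or shrinking only the upper bound) are assumptions chosen for illustration, and plain Python lists stand in for weight or activation tensors.

```python
import random


def fake_quantize(values, lower, upper, n_bits):
    """Clip to [lower, upper], round onto 2**n_bits uniform levels, dequantize."""
    n_intervals = 2 ** n_bits - 1                  # gaps between adjacent levels
    scale = (upper - lower) / n_intervals
    out = []
    for v in values:
        clipped = min(max(v, lower), upper)
        level = round((clipped - lower) / scale)   # integer code in [0, n_intervals]
        out.append(lower + level * scale)
    return out


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def search_bounds(values, n_bits, n_candidates=100, symmetric=True):
    """Grid-search clipping bounds that minimise the quantization MSE.

    symmetric=True : candidates are (-t, t), with t shrinking from max|v|.
    symmetric=False: the lower bound stays at min(v); only the upper bound shrinks.
    Both strategies are illustrative assumptions, not the paper's exact procedure.
    """
    v_min, v_max = min(values), max(values)
    best_bounds, best_err = None, float("inf")
    for i in range(n_candidates):
        keep = 1.0 - i / n_candidates              # fraction of the full range kept
        if symmetric:
            t = keep * max(abs(v_min), abs(v_max))
            lower, upper = -t, t
        else:
            lower, upper = v_min, v_min + keep * (v_max - v_min)
        if upper <= lower:                         # degenerate range (constant input)
            continue
        err = mse(values, fake_quantize(values, lower, upper, n_bits))
        if err < best_err:
            best_bounds, best_err = (lower, upper), err
    return best_bounds, best_err


# Long-tailed toy data: a Gaussian bulk plus one large outlier.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(2000)] + [50.0]
bounds, err = search_bounds(data, n_bits=2)
print(bounds, err)
```

On long-tailed data such as this toy sample, the search prefers a bound far tighter than the raw min/max range, because clipping a few outliers costs less than spending the four available 2-bit levels on an almost empty range.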

What problem does this paper attempt to address?

### Problems Addressed by the Paper

This paper addresses the significant performance degradation of image super-resolution (SR) models after low-bit quantization. Although low-bit quantization compresses model parameters and accelerates inference, it usually causes severe accuracy loss in Transformer-based SR models. The cause is the distinctive distribution of weights and activations in these models: symmetric and asymmetric distributions coexist, and many have long tails, which makes conventional quantization methods hard to apply effectively.

### Background and Motivation

1. **Importance of Image Super-Resolution**: Image super-resolution is a classic low-level computer vision task widely used in fields such as medical imaging, surveillance, remote sensing, and mobile photography.
2. **Development of Deep Neural Networks**: DNN-based SR models can reconstruct high-resolution images but require a large amount of storage and computation.
3. **Need for Model Compression**: To deploy these models on edge devices, they must be compressed to reduce storage and computational requirements. Common compression methods include lightweight architecture design, pruning, and quantization.
4. **Challenges of Quantization**: Existing quantization methods mainly target convolutional neural network (CNN) models and are less effective for Transformer-based models, whose weight and activation distributions are difficult for conventional quantizers to fit.

### Research Objectives

1. **Propose a Low-Bit Quantization Method for Transformer Models**: The method aims to achieve efficient model compression and acceleration while maintaining high performance.
2. **Address the Performance Degradation of Transformer Models after Quantization**: New quantization strategies are designed so that the quantized model's performance stays as close as possible to that of the full-precision model.

### Main Contributions

1. **First Exploration of Post-Training Quantization (PTQ) for Transformer-Based SR Models**: The paper proposes 2DQuant, a two-stage PTQ method that optimizes quantizer bounds using DOBI and DQC.
2. **First Stage: Distribution-Oriented Bound Initialization (DOBI)**: A fast MSE-based search finds coarse quantizer bounds, with different search strategies for differently distributed data.
3. **Second Stage: Distillation Quantization Calibration (DQC)**: Knowledge distillation lets the quantized model learn from the full-precision model, further refining the quantizer parameters.
4. **Experimental Results**: 2DQuant outperforms existing state-of-the-art methods on multiple benchmark datasets. In 2-bit quantization, PSNR improves by up to 4.52dB on Set5 (x2), with a 3.60x compression ratio and a 5.08x speedup.

### Method Overview

1. **Analyze Data Distribution**: A detailed analysis of the weight and activation distributions in Transformer models reveals coexisting symmetry and asymmetry as well as long tails.
2. **Distribution-Oriented Bound Initialization (DOBI)**: Different search strategies are applied depending on the data distribution to quickly find coarse quantizer bounds.
3. **Distillation Quantization Calibration (DQC)**: Knowledge distillation refines the quantizer parameters so that both the output and the intermediate features of the quantized model stay consistent with those of the full-precision model.

### Experimental Setup and Results

1. **Datasets**: DF2K is used as the training data and Set5 as the validation set; the test sets are Set5, Set14, B100, Urban100, and Manga109.
2. **Evaluation Metrics**: PSNR and SSIM are used as evaluation metrics, both computed on the Y channel.
3. **Experimental Results**: 2DQuant significantly outperforms existing quantization methods on multiple benchmark datasets, most notably in 2-bit quantization, where PSNR improves by up to 4.52dB.

### Conclusion

With 2DQuant, the paper substantially reduces the performance degradation of Transformer-based image super-resolution models after low-bit quantization while achieving efficient model compression and acceleration. The method offers a new option for deploying high-performance SR models on edge devices.
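The evaluation protocol mentioned in the summary (PSNR computed on the Y channel) can be sketched as follows. The summary does not say which colour conversion is used; this sketch assumes the BT.601 luma convention that is common in SR benchmarks, 8-bit RGB inputs, and images given as flat lists of `(R, G, B)` tuples. The helper names `rgb_to_y` and `psnr_y` are hypothetical, and border cropping is left out.

```python
import math


def rgb_to_y(pixel):
    """Map an 8-bit (R, G, B) pixel to BT.601 luma in the range [16, 235]."""
    r, g, b = pixel
    return 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0


def psnr_y(img_a, img_b, max_value=255.0):
    """PSNR in dB between the Y channels of two equally sized images."""
    if len(img_a) != len(img_b):
        raise ValueError("images must contain the same number of pixels")
    mse = sum((rgb_to_y(p) - rgb_to_y(q)) ** 2
              for p, q in zip(img_a, img_b)) / len(img_a)
    if mse == 0.0:
        return math.inf                    # identical Y channels
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2-pixel "images": a reference and a slightly perturbed reconstruction.
reference = [(120, 200, 80), (10, 20, 30)]
restored = [(118, 201, 82), (12, 19, 30)]
print(psnr_y(reference, restored))
```

Because PSNR is logarithmic in the mean squared error, a gain of several dB, such as the 4.52dB reported for 2-bit quantization, corresponds to a large reduction in pixel-level error.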

2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution