Soft then Hard: Rethinking the Quantization in Neural Image Compression

Zongyu Guo, Zhizheng Zhang, Runsen Feng, Zhibo Chen
DOI: https://doi.org/10.48550/arXiv.2104.05168
2024-03-25
Abstract: Quantization is one of the core components in lossy image compression. For neural image compression, end-to-end optimization requires differentiable approximations of quantization, which can generally be grouped into three categories: additive uniform noise, straight-through estimator and soft-to-hard annealing. Training with additive uniform noise approximates the quantization error variationally but suffers from the train-test mismatch. The other two methods do not encounter this mismatch but, as shown in this paper, hurt the rate-distortion performance since the latent representation ability is weakened. We thus propose a novel soft-then-hard quantization strategy for neural image compression that first learns an expressive latent space softly, then closes the train-test mismatch with hard quantization. In addition, beyond the fixed integer quantization, we apply scaled additive uniform noise to adaptively control the quantization granularity by deriving a new variational upper bound on actual rate. Experiments demonstrate that our proposed methods are easy to adopt, stable to train, and highly effective especially on complex compression models.
Image and Video Processing
What problem does this paper attempt to address?
This paper addresses two shortcomings of the quantization approximations used to train neural image compression models: the train-test mismatch and the weakened representation ability of the latent space. Specifically:

1. **Train-test mismatch**: Approximating quantization with Additive Uniform Noise (AUN) makes end-to-end optimization possible, but the model is trained on noisy latents and tested on hard-quantized ones. This inconsistency degrades rate-distortion performance.
2. **Weakened latent representation ability**: The Straight-Through Estimator (STE) and soft-to-hard annealing avoid the train-test mismatch, but they lack a regularization term during training. This makes it difficult to learn a smooth latent space and weakens the latent representation ability.

To solve these problems, the authors propose a "Soft-then-Hard" (STH) quantization strategy that combines the advantages of additive uniform noise and hard quantization: the latent space is first learned softly, and the train-test mismatch is then closed with hard quantization. In addition, Scaled Uniform Noise (SUN) is introduced to adaptively control the quantization granularity and further improve the performance of the compression model.
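The train-test mismatch described above can be illustrated with a small numerical sketch. This is not the authors' implementation. It only contrasts the additive-uniform-noise surrogate used in training with the hard rounding used at test time, on synthetic latent values. The function names, the `step` parameter (a stand-in for an adjustable quantization granularity) and the Gaussian test data are all assumptions made for illustration.

```python
import random


def hard_quantize(y, step=1.0):
    """Test-time quantization: snap each value to the nearest multiple of `step`."""
    return [step * round(v / step) for v in y]


def additive_uniform_noise(y, step=1.0, rng=random):
    """Training-time surrogate: add noise drawn from U(-step/2, step/2).

    With step=1.0 this is the usual unit-width uniform noise; other values
    of `step` are a hypothetical way to vary the granularity.
    """
    return [v + step * (rng.random() - 0.5) for v in y]


# Synthetic latent values (an assumption; real latents come from an encoder).
rng = random.Random(0)
latent = [rng.gauss(0.0, 3.0) for _ in range(10000)]

noisy = additive_uniform_noise(latent, rng=rng)  # what the decoder sees in training
hard = hard_quantize(latent)                     # what the decoder sees at test time

# Both perturbations stay within half a quantization step of the input...
max_noise_err = max(abs(a - b) for a, b in zip(noisy, latent))
max_hard_err = max(abs(a - b) for a, b in zip(hard, latent))

# ...but the two decoder inputs differ from each other: this gap is the mismatch.
mismatch = sum(abs(a - b) for a, b in zip(noisy, hard)) / len(latent)

print(f"max |noisy - y| = {max_noise_err:.3f}")
print(f"max |hard  - y| = {max_hard_err:.3f}")
print(f"mean |noisy - hard| = {mismatch:.3f}")
```

The noisy and rounded latents have error of similar magnitude relative to the input, yet they are different signals, so a decoder trained only on the former is evaluated on inputs it never saw. The straight-through estimator is omitted here because it only differs from hard rounding in how gradients are propagated, which requires an automatic differentiation framework to show.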