Rate-Distortion-Cognition Controllable Versatile Neural Image Compression

Jinming Liu, Ruoyu Feng, Yunpeng Qi, Qiuyu Chen, Zhibo Chen, Wenjun Zeng, Xin Jin
2024-07-17
Abstract: Recently, the field of Image Coding for Machines (ICM) has garnered heightened interest and seen significant advances thanks to the rapid progress of learning-based techniques for image compression and analysis. Previous studies often require training separate codecs to support various bitrate levels, machine tasks, and networks, and thus lack both flexibility and practicality. To address these challenges, we propose a rate-distortion-cognition controllable versatile image compression method that allows users to adjust the bitrate (i.e., Rate), image reconstruction quality (i.e., Distortion), and machine task accuracy (i.e., Cognition) with a single neural model, achieving ultra-controllability. Specifically, we first introduce a cognition-oriented loss in the primary compression branch to train a codec for diverse machine tasks. This branch attains variable bitrate by regulating the quantization degree through the latent code channels. To further enhance the quality of the reconstructed images, we employ an auxiliary branch to supplement residual information with a scalable bitstream. Finally, the two branches use a $\beta x + (1 - \beta) y$ interpolation strategy to achieve a balanced cognition-distortion trade-off. Extensive experiments demonstrate that our method yields satisfactory ICM performance and flexible Rate-Distortion-Cognition control.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses how to control the bitrate (Rate), image reconstruction quality (Distortion), and machine task accuracy (Cognition) simultaneously in image compression. Existing image compression methods usually require training a separate codec for each bitrate level, machine task, and network, which increases training cost and memory requirements and leads to poor generalization. These methods also cannot flexibly adjust the trade-off between decompression fidelity and task performance within a single framework.

To solve these problems, the paper proposes a new neural image compression method: Rate-Distortion-Cognition Controllable Versatile Neural Image Compression. In the main compression branch, the method introduces a cognition-oriented loss function combined with a contrastive learning strategy, so that a single trained codec can support multiple machine tasks. Variable bitrates are achieved in the same branch by adjusting the quantization degree. To further improve the quality of the reconstructed image, the method adds an auxiliary branch that supplements the residual information through an additional scalable bitstream. Finally, the two branches are combined with a $\beta x + (1 - \beta) y$ interpolation strategy to achieve a balanced cognition-distortion trade-off, giving ultra-controllability over rate, distortion, and cognition in one model.

The main contributions of this method include:
1. Proposing a general neural image compression model that achieves rate-distortion-cognition controllability in a single codec.
2. Introducing a contrastive-learning-based cognition-oriented loss function to generate compressed images that are more conducive to downstream tasks, while achieving variable bitrates through trainable gain units.
3. Exploring the relationship between the information required for distortion and for cognition, and designing an auxiliary scalable bitstream to achieve a balanced cognition-distortion trade-off through a two-branch interpolation strategy.

Through these innovations, the method not only improves the flexibility and practicality of image compression but also achieves a good balance between machine task performance and image reconstruction quality.
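
The two control mechanisms named in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `quantize_with_gain` and `interpolate` are hypothetical, and the sketch assumes that a gain unit is a per-channel multiplier applied to the latent code before rounding, and that x and y in $\beta x + (1 - \beta) y$ stand for the equally sized outputs of the two branches. The summary does not say which branch is x and which is y, so the sketch leaves that open.

```python
from typing import List, Sequence


def quantize_with_gain(latent: Sequence[float], gain: Sequence[float]) -> List[int]:
    """Scale each latent channel by its gain, then round to an integer.

    Assumption: a larger gain spreads values over more integer levels
    (finer quantization, higher bitrate); a smaller gain does the opposite.
    """
    if len(latent) != len(gain):
        raise ValueError("latent and gain must have the same length")
    return [round(v * g) for v, g in zip(latent, gain)]


def interpolate(x: Sequence[float], y: Sequence[float], beta: float) -> List[float]:
    """Blend two equally sized signals as beta * x + (1 - beta) * y."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [beta * a + (1.0 - beta) * b for a, b in zip(x, y)]


if __name__ == "__main__":
    latent = [0.3, 1.2]
    # The same latent quantized at two hypothetical gain settings.
    print(quantize_with_gain(latent, [1.0, 1.0]))
    print(quantize_with_gain(latent, [4.0, 4.0]))

    # Hypothetical branch outputs blended at three values of beta.
    branch_x = [0.0, 10.0]
    branch_y = [10.0, 0.0]
    for beta in (0.0, 0.5, 1.0):
        print(beta, interpolate(branch_x, branch_y, beta))
```

Setting beta to 0 or 1 returns one branch output unchanged, and intermediate values move linearly between the two, which is what lets a single scalar steer the cognition-distortion trade-off at inference time.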