Abstract: In recent years, layered image compression has been demonstrated to be a promising direction: it encodes a compact representation of the input image and applies an up-sampling network to reconstruct the image. To further improve the quality of the reconstructed image, some works transmit the semantic segment together with the compressed image data. Consequently, the compression ratio decreases because extra bits are required to transmit the semantic segment. To solve this problem, we propose a new layered image compression framework with encoder-decoder matched semantic segmentation (EDMS). Following the semantic segmentation, a special convolutional neural network is used to enhance the inaccurate semantic segment. As a result, an accurate semantic segment can be obtained in the decoder without requiring extra bits. The experimental results show that the proposed EDMS framework achieves up to 35.31% BD-rate reduction over the HEVC-based (BPG) codec, and 5% bitrate and 24% encoding time savings compared to the state-of-the-art semantic-based image codec.

What problem does this paper attempt to address?

This paper aims to improve the quality of compressed images without transmitting additional bits. Existing semantic-based image compression methods improve reconstruction quality by transmitting semantic segmentation information alongside the compressed image, but storing and transmitting that information costs extra bits and therefore lowers the compression ratio. To solve this problem, the paper proposes a new layered image compression framework: Encoder-Decoder Matched Semantic Segmentation (EDMS).

### Main contributions:

1. **Avoid additional bit transmission**: The same semantic segmentation network is applied in both the encoder and the decoder, so the decoder obtains the same segmentation result as the encoder without any semantic information being transmitted.
2. **Improve the quality of reconstructed images**: To compensate for the inaccuracy of the semantic segmentation extracted from the up-sampled image, the paper introduces a special convolutional neural network (CNN) that enhances the segmentation.
3. **Performance improvement**: The experimental results show that the proposed EDMS framework outperforms both traditional image compression methods and state-of-the-art semantic-based image compression methods in terms of peak signal-to-noise ratio (PSNR) and multi-scale structural similarity (MS-SSIM).

### Technical details:

- **Semantic segmentation network**: The semantic segmentation network is applied in both the encoder and the decoder to ensure that the same segmentation result is obtained at both ends.
- **Semantic segmentation enhancement**: A special recursive residual network (SMapNet) enhances the semantic segmentation extracted from the up-sampled image, bringing it closer to the segmentation of the original image.
- **Overall framework**: The framework consists of three main networks: CompNet (for down-sampling), FineNet (for image synthesis), and SMapNet (for semantic segmentation enhancement).

### Experimental results:

- **Performance comparison**: On the ADE20K test set, the average PSNR of the EDMS framework in the range of 1.1-2.1 bpp is 2.5 dB higher than that of BPG.
- **Processing speed**: Compared with DSSLIC, the EDMS framework reduces the bit rate by 5% at the same PSNR and reduces the encoding time by 24%.
- **Generalization ability**: Tests on the Kodak data set show that the EDMS framework also performs well on images with a different distribution: the average PSNR in the range of 1.3-2.3 bpp is 2.4 dB higher than that of BPG.

In conclusion, the paper proposes an image compression method that improves the quality of compressed images while avoiding the transmission of additional bits for the semantic segment, which improves both compression ratio and encoding speed.
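
The encoder-decoder matching idea described above can be illustrated with a toy sketch. It is not the paper's implementation: the functions `comp_net`, `upsample`, `segment`, and `smap_net` are hypothetical stand-ins (average pooling, nearest-neighbour up-sampling, thresholding, and a 3x3 majority filter) for the real networks, and the image is a tiny grayscale grid. The only point it shows is that when both sides run the same deterministic steps on the transmitted compact representation, the decoder reproduces the encoder's semantic map without receiving it.

```python
from typing import List, Tuple

Image = List[List[int]]  # toy grayscale image: rows of 0-255 integers


def comp_net(image: Image, factor: int = 2) -> Image:
    """Hypothetical stand-in for CompNet: average-pool down-sampling."""
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[y + dy][x + dx]
                for dy in range(factor) for dx in range(factor)) // (factor * factor)
            for x in range(0, w, factor)
        ]
        for y in range(0, h, factor)
    ]


def upsample(compact: Image, factor: int = 2) -> Image:
    """Hypothetical up-sampler: nearest-neighbour pixel repetition."""
    return [
        [px for px in row for _ in range(factor)]
        for row in compact
        for _ in range(factor)
    ]


def segment(image: Image, threshold: int = 128) -> Image:
    """Hypothetical segmentation network: two classes by thresholding."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def smap_net(seg: Image) -> Image:
    """Hypothetical stand-in for SMapNet: 3x3 majority filter on the label map."""
    h, w = len(seg), len(seg[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            votes = [seg[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(1 if 2 * sum(votes) >= len(votes) else 0)
        out.append(row)
    return out


def derive_semantic_map(compact: Image) -> Image:
    """Shared by encoder and decoder: depends only on the compact image."""
    return smap_net(segment(upsample(compact)))


def encode(image: Image) -> Tuple[Image, Image]:
    compact = comp_net(image)
    semantic_map = derive_semantic_map(compact)  # used locally, never transmitted
    return compact, semantic_map


def decode(compact: Image) -> Image:
    # The decoder receives only `compact` and re-derives the map itself.
    return derive_semantic_map(compact)


image = [
    [200, 210, 10, 20],
    [220, 230, 30, 40],
    [15, 25, 240, 250],
    [35, 45, 245, 255],
]
compact, encoder_map = encode(image)
decoder_map = decode(compact)
print("maps match:", encoder_map == decoder_map)
```

In this sketch the match holds by construction, because both sides call `derive_semantic_map` on the same input. In a real system the same property would additionally require the networks to produce bit-identical outputs on the encoder and decoder hardware, which this toy example does not address.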

Image Compression with Encoder-Decoder Matched Semantic Segmentation

A Unified End-to-End Framework for Efficient Deep Image Compression

Towards Semantically Scalable Image Coding Using Semantic Map.

Fast and High-Performance Learned Image Compression With Improved Checkerboard Context Model, Deformable Residual Module, and Knowledge Distillation

Multi-Modality Deep Network for Extreme Learned Image Compression

Semantic-Aware Visual Decomposition for Image Coding

Asymmetric Learned Image Compression with Multi-Scale Residual Block, Importance Scaling, and Post-Quantization Filtering

Unified and Scalable Deep Image Compression Framework for Human and Machine

An End-to-End Deep Learning Image Compression Framework Based on Semantic Analysis

CodedVision: Towards Joint Image Understanding and Compression via End-to-End Learning

Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression

Layered Image Compression Using Scalable Auto-Encoder

Semantic Scalable Image Compression with Cross-Layer Priors.

Learned Image Compression With Gaussian-Laplacian-Logistic Mixture Model and Concatenated Residual Modules

Optimized Decoupled Structure with Non-Local Attention for Deep Image Compression

Enhanced Standard Compatible Image Compression Framework based on Auxiliary Codec Networks

Deep Image Compression Towards Machine Vision: A Unified Optimization Framework

Deep Image Compression Toward Machine Vision: A Unified Optimization Framework

Practical Learned Image Compression with Online Encoder Optimization