Abstract: Recently, deep learning-based image compression methods have made significant progress and have gradually outperformed traditional approaches, including the latest standard Versatile Video Coding (VVC), in both PSNR and MS-SSIM metrics. Two key components of learned image compression are the entropy model of the latent representations and the encoding/decoding network architectures. Various models have been proposed, such as autoregressive, softmax, logistic mixture, Gaussian mixture, and Laplacian. Existing schemes use only one of these models. However, due to the vast diversity of images, it is not optimal to use one model for all images, or even for different regions within one image. In this paper, we propose a more flexible discretized Gaussian-Laplacian-Logistic mixture model (GLLMM) for the latent representations, which can adapt to different contents in different images and different regions of one image more accurately and efficiently, given the same complexity. In addition, for the encoding/decoding network design, we propose concatenated residual blocks (CRB), in which multiple residual blocks are serially connected with additional shortcut connections. The CRB improves the learning ability of the network, which further improves the compression performance. Experimental results on the Kodak, Tecnick-100 and Tecnick-40 datasets show that the proposed scheme outperforms all the leading learning-based methods and existing compression standards, including VVC intra coding (4:4:4 and 4:2:0), in terms of PSNR and MS-SSIM. The source code is available at https://github.com/fengyurenpingsheng

What problem does this paper attempt to address?

Deep learning-based image compression methods have gradually surpassed traditional codecs (including the latest standard, Versatile Video Coding, VVC) in PSNR and MS-SSIM. The paper argues that two main problems remain:

1. **Single entropy model**: Existing schemes usually use only one probability model (such as an autoregressive model, softmax, a logistic mixture model, a Gaussian mixture model, or a Laplacian model) to represent the distribution of the latent representations. Because image content is so diverse, a single model is not optimal for all images, or for different regions of the same image.
2. **Spatial redundancy**: Even with complex network structures, some spatial redundancy remains in the latent representation, which limits compression performance.

To address these problems, the authors propose the following solutions:

- **A more flexible mixture model**: A discretized Gaussian-Laplacian-Logistic Mixture Model (GLLMM), which adapts to the content of different images and different regions of the same image more accurately and efficiently at the same complexity.
- **An improved encoding/decoding network architecture**: Concatenated Residual Blocks (CRB), in which multiple residual blocks are serially connected with additional shortcut connections. This improves the learning ability of the network and thereby the compression performance.

The experimental results show that the proposed scheme outperforms the leading learning-based methods and traditional compression standards (including VVC intra coding) in PSNR and MS-SSIM on the Kodak, Tecnick-100 and Tecnick-40 datasets. Specifically, on the Kodak dataset, at bit rates above 0.4 bpp, the method is 0.2-0.3 dB higher than other methods; on the Tecnick dataset, at bit rates above 0.2 bpp, it is 0.3-0.4 dB higher than VVC (4:4:4). The authors present these results as state of the art for learning-based image compression.
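The discretized mixture idea behind GLLMM can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flat `(weight, family, mu, scale)` component layout, and all parameter values are assumptions made for illustration. In a learned codec these parameters would be predicted by the network for each latent element, and the paper may organize the weights differently. The sketch computes the probability of an integer-quantized latent value as the mixture CDF mass over the bin `[y - 0.5, y + 0.5]`, and the corresponding estimated bit cost.

```python
import math

def gaussian_cdf(x, mu, scale):
    # scale is the standard deviation
    return 0.5 * (1.0 + math.erf((x - mu) / (scale * math.sqrt(2.0))))

def laplacian_cdf(x, mu, scale):
    # scale is the Laplacian diversity parameter b
    z = (x - mu) / scale
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)

def logistic_cdf(x, mu, scale):
    # tanh form of the sigmoid, which avoids overflow for large |x - mu|
    return 0.5 * (1.0 + math.tanh((x - mu) / (2.0 * scale)))

# Family index -> CDF (hypothetical encoding: 0 Gaussian, 1 Laplacian, 2 logistic)
CDFS = (gaussian_cdf, laplacian_cdf, logistic_cdf)

def mixture_pmf(y, components):
    """Probability of integer symbol y under a discretized mixture.

    components: iterable of (weight, family, mu, scale); weights sum to 1.
    """
    total = 0.0
    for weight, family, mu, scale in components:
        cdf = CDFS[family]
        total += weight * (cdf(y + 0.5, mu, scale) - cdf(y - 0.5, mu, scale))
    return total

def estimated_bits(y, components, floor=1e-9):
    # Ideal code length in bits; the floor guards against log2(0)
    return -math.log2(max(mixture_pmf(y, components), floor))

# Illustrative parameters only (assumed, not taken from the paper)
example = [
    (0.5, 0, 0.0, 1.0),   # Gaussian component
    (0.3, 1, 0.2, 2.0),   # Laplacian component
    (0.2, 2, -0.1, 1.0),  # logistic component
]
print(estimated_bits(0, example), estimated_bits(5, example))
```

Because each component contributes a difference of CDF values, the probabilities of all integer symbols sum to one, so the result can be passed directly to an arithmetic coder as a valid distribution.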

Learned Image Compression with Gaussian-Laplacian-Logistic Mixture Model and Concatenated Residual Modules

Learned Image Compression with Inception Residual Blocks and Multi-Scale Attention Module

Asymmetric Learned Image Compression with Multi-Scale Residual Block, Importance Scaling, and Post-Quantization Filtering

Learned Image Compression with Discretized Gaussian Mixture Likelihoods and Attention Modules

Learned image compression via neighborhood-based attention optimization and context modeling with multi-scale guiding

MLIC: Multi-Reference Entropy Model for Learned Image Compression

Fast and High-Performance Learned Image Compression With Improved Checkerboard Context Model, Deformable Residual Module, and Knowledge Distillation

MLIC++: Linear Complexity Multi-Reference Entropy Modeling for Learned Image Compression

Learning-Based Scalable Image Compression With Latent-Feature Reuse and Prediction

An Extended Context-Based Entropy Hybrid Modeling for Image Compression

Learned Image Compression with Large Capacity and Low Redundancy of Latent Representation

Latent-Separated Global Prediction for Learned Image Compression

Multi-Modality Deep Network for Extreme Learned Image Compression

Learned Image Compression with Dual-Branch Encoder and Conditional Information Coding

A Unified End-to-End Framework for Efficient Deep Image Compression

Learned Lossless Image Compression With Combined Autoregressive Models And Attention Modules

MLIC++: Linear Complexity Attention-based Multi-Reference Entropy Modeling for Learned Image Compression

Deep Image Compression with Residual Learning

Efficient Learned Image Compression with Selective Kernel Residual Module and Channel-Wise Causal Context Model