Abstract: Recently, learning-based image compression models have attracted much attention due to their impressive performance and ease of optimization compared with traditional DCT- and wavelet-based image compression standards. Most learning-based image compression models are trained to minimize a joint rate-distortion (RD) loss at a single RD trade-off point. However, in many multimedia applications, communication constraints or display adaptation needs for different spatial formats, bit rates, or power budgets make it necessary to provide a variety of image versions for different client devices. To fulfill this requirement, typical end-to-end image compression methods have to compress an image into several bitstreams independently using a number of pre-trained networks, which is resource-consuming because of the redundancy among these streams. To address this problem, inspired by the traditional scalable video coding framework, we propose a learning-based end-to-end quality and spatial scalable image compression (QSSIC) model with a multi-layer structure, in which each layer generates one bitstream corresponding to a specified resolution and image fidelity. This scalability is achieved by exploring the potential of feature-domain representation prediction and reuse. Specifically, first, the bitstreams of previous layers are used to predict the current layer's representations, which contain the enhancement information, so that only prediction residuals need to be coded in the enhancement layers. Second, previous bitstreams are reused for image reconstruction in higher layers to provide basic information. The proposed model can be optimized in an end-to-end manner. Extensive experiments show that our method outperforms state-of-the-art deep neural network (DNN)-based auto-encoders in simulcast scenarios. In addition, our method performs better than the traditional scalable image compression method, the scalable extension of H.264/AVC (SVC), and is comparable to the scalable extension of H.265/HEVC (SHVC).
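
The layered idea described in the abstract (predict the current layer from earlier bitstreams, code only the residual, and reuse earlier bitstreams at reconstruction) can be illustrated with a toy scalar example. This is only a minimal sketch under strong assumptions, not the QSSIC model: the learned feature-domain predictor is replaced here by plain dequantization of the base layer, the latent is a short list of floats, and all names (`quantize`, `encode_layers`, `decode_layers`, the step sizes) are hypothetical.

```python
from typing import List, Optional, Tuple


def quantize(values: List[float], step: float) -> List[int]:
    """Uniform scalar quantization: map each value to an integer symbol."""
    return [round(v / step) for v in values]


def dequantize(symbols: List[int], step: float) -> List[float]:
    """Map integer symbols back to reconstruction values."""
    return [s * step for s in symbols]


def encode_layers(
    latent: List[float], base_step: float, enh_step: float
) -> Tuple[List[int], List[int]]:
    """Produce a coarse base-layer stream and a residual enhancement stream."""
    base_symbols = quantize(latent, base_step)
    # Assumption: the "prediction" is simply the decoded base layer.
    # The paper uses a learned feature-domain predictor instead.
    prediction = dequantize(base_symbols, base_step)
    residual = [x - p for x, p in zip(latent, prediction)]
    enh_symbols = quantize(residual, enh_step)
    return base_symbols, enh_symbols


def decode_layers(
    base_symbols: List[int],
    enh_symbols: Optional[List[int]],
    base_step: float,
    enh_step: float,
) -> List[float]:
    """Decode the base layer alone, or reuse it and add the residual."""
    base_rec = dequantize(base_symbols, base_step)
    if enh_symbols is None:
        return base_rec  # low-quality version from the base stream only
    residual_rec = dequantize(enh_symbols, enh_step)
    return [b + r for b, r in zip(base_rec, residual_rec)]


if __name__ == "__main__":
    latent = [0.3, -1.7, 2.2, 0.9]
    base, enh = encode_layers(latent, base_step=1.0, enh_step=0.25)
    print("base only:", decode_layers(base, None, 1.0, 0.25))
    print("base + enhancement:", decode_layers(base, enh, 1.0, 0.25))
```

A client that receives only the base stream still obtains a usable reconstruction, while a client that receives both streams refines it without the base information being transmitted twice. That is the redundancy a simulcast setup with independent streams would incur.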

Towards Extreme Image Compression with Latent Feature Guidance and Diffusion Prior

Diffusion-based Extreme Image Compression with Compressed Feature Initialization

Extreme Generative Image Compression by Learning Text Embedding from Diffusion Models

Extreme Video Compression with Pre-trained Diffusion Models

Lossy Image Compression with Conditional Diffusion Models

Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach

Extremely Low Bit-rate Image Compression Via Invertible Image Generation

Consistency Guided Diffusion Model with Neural Syntax for Perceptual Image Compression

Generative Latent Coding for Ultra-Low Bitrate Image Compression

Map-Assisted Remote-Sensing Image Compression at Extremely Low Bitrates

High Frequency Matters: Uncertainty Guided Image Compression with Wavelet Diffusion

A Unified End-to-End Framework for Efficient Deep Image Compression

Lossy Image Compression with Foundation Diffusion Models

Controllable Distortion-Perception Tradeoff Through Latent Diffusion for Neural Image Compression

Exploring Distortion Prior with Latent Diffusion Models for Remote Sensing Image Compression

Learning-Based Scalable Image Compression With Latent-Feature Reuse and Prediction

Extreme Image Compression using Fine-tuned VQGANs

PixRevive: Latent Feature Diffusion Model for Compressed Video Quality Enhancement

Coarse-to-Fine Hyper-Prior Modeling for Learned Image Compression

Learned Image Compression with Large Capacity and Low Redundancy of Latent Representation

Fine color guidance in diffusion models and its application to image compression at extremely low bitrates