Abstract: Recently, learning-based image compression models have attracted much attention due to their impressive performance and ease of optimization compared with traditional DCT- and wavelet-based image compression standards. Most learning-based image compression models are trained to minimize a joint rate-distortion (RD) loss at a single RD trade-off point. However, in many multimedia applications, communication constraints or the need to adapt to different spatial formats, bit rates, or power budgets make it necessary to provide a variety of image versions for different client devices. To fulfill this requirement, typical end-to-end image compression methods have to compress an image into several bitstreams independently, using a number of pre-trained networks, which is resource-consuming because of the redundancy among these streams. To address this problem, inspired by the traditional scalable video coding framework, we propose a learning-based end-to-end quality and spatial scalable image compression (QSSIC) model with a multi-layer structure, in which each layer generates one bitstream corresponding to a specified resolution and image fidelity. This scalability is achieved by exploring the potential of feature-domain representation prediction and reuse. Specifically, first, the bitstreams of previous layers are used to predict the current layer's representations, which contain the enhancement information, so that only the prediction residuals need to be coded in the enhancement layers. Second, previous bitstreams are reused for image reconstruction in higher layers to provide basic information. The proposed model can be optimized in an end-to-end manner. Extensive experiments show that our method outperforms state-of-the-art deep neural network (DNN)-based auto-encoders in simulcast scenarios. In addition, our method performs better than the traditional scalable image compression method based on the scalable extension of H.264/AVC (SVC) and is comparable to the scalable extension of H.265/HEVC (SHVC).
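
The prediction-and-residual idea described in the abstract can be illustrated with a toy sketch. This is not the QSSIC model: the latents are plain lists of numbers, the predictor is a hypothetical nearest-neighbour upsampling of the decoded base layer, and uniform rounding stands in for the learned quantization and entropy coding. All function names below are assumptions made for illustration.

```python
def quantize(values, step=1.0):
    """Uniform scalar quantization: snap each value to the nearest multiple of step."""
    return [step * round(v / step) for v in values]


def predict_from_base(base_hat, factor=2):
    """Hypothetical predictor: nearest-neighbour upsampling of the decoded base latent."""
    return [v for v in base_hat for _ in range(factor)]


def encode_layers(y_base, y_enh, step=1.0):
    """Code the base latent directly and the enhancement latent as a prediction residual."""
    base_hat = quantize(y_base, step)
    pred = predict_from_base(base_hat)
    residual = [e - p for e, p in zip(y_enh, pred)]
    return base_hat, quantize(residual, step)


def decode_enhancement(base_hat, residual_hat):
    """Reuse the base layer at the decoder: rebuild the prediction and add the residual."""
    pred = predict_from_base(base_hat)
    return [p + r for p, r in zip(pred, residual_hat)]


def magnitude(symbols):
    """Crude stand-in for bit cost: total absolute value of the coded symbols."""
    return sum(abs(s) for s in symbols)


# Toy latents (made-up numbers): the enhancement layer has twice the resolution.
y_base = [4.2, -3.1, 7.8]
y_enh = [4.0, 4.6, -3.3, -2.7, 8.1, 7.2]

base_hat, residual_hat = encode_layers(y_base, y_enh)
y_enh_hat = decode_enhancement(base_hat, residual_hat)

# Simulcast baseline: code the enhancement latent independently of the base layer.
simulcast_symbols = quantize(y_enh)

print("residual symbols:", residual_hat, "magnitude:", magnitude(residual_hat))
print("simulcast symbols:", simulcast_symbols, "magnitude:", magnitude(simulcast_symbols))
```

In this toy setting the enhancement layer is reconstructed to the same values as independent coding would give, while the residual symbols are much smaller than the simulcast symbols. That is the redundancy a layered scheme aims to remove; the real model learns the predictor and the entropy model end to end.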

FICNet: an End to End Network for Free-view Image Coding

IEF-CSNET: Information Enhancement and Fusion Network for Compressed Sensing Reconstruction

Learning-Based Scalable Image Compression With Latent-Feature Reuse and Prediction

An End-to-End Compression Framework Based on Convolutional Neural Networks

An End-to-End Learning Framework for Video Compression.

A Unified End-to-End Framework for Efficient Deep Image Compression

CodedVision: Towards Joint Image Understanding and Compression via End-to-End Learning

End-to-end Varifocal Multiview Images Coding Framework from Data Acquisition End to Vision Application End

FFCA-Net: Stereo Image Compression via Fast Cascade Alignment of Side Information

End-to-end Compression Towards Machine Vision: Network Architecture Design and Optimization

DVC: An End-to-end Deep Video Compression Framework

Deep Convolutional Neural Network For Decompressed Video Enhancement

Learning a Virtual Codec Based on Deep Convolutional Neural Network to Compress Image

End-to-End Learned Scalable Multilayer Feature Compression for Machine Vision Tasks

Efficient Learning Based Sub-pixel Image Compression.

End-to-End Learnt Image Compression via Non-Local Attention Optimization and Improved Context Modeling

Mixed-Resolution Image Representation and Compression with Convolutional Neural Networks.

FVC: An End-to-End Framework Towards Deep Video Compression in Feature Space

FVC: A New Framework Towards Deep Video Compression in Feature Space

End-to-end feature domain residual coding network for multispectral image compression based on interspectral prediction

End-to-End Image Compression Via Attention-Guided Information-Preserving Module