Abstract: The effective receptive field (ERF) plays an important role in transform coding because it determines how much redundancy can be removed during the transform and how much spatial prior information can be used to synthesize textures during the inverse transform. Existing methods rely either on stacks of small kernels, whose ERFs remain insufficiently large, or on heavy non-local attention mechanisms, which limit the potential of high-resolution image coding. To tackle this issue, we propose Large Receptive Field Transform Coding with Adaptive Weights for Learned Image Compression (LLIC). Specifically, for the first time in the learned image compression community, we introduce a few large kernel-based depth-wise convolutions to remove more redundancy while maintaining modest complexity. Because images are highly diverse, we further propose a mechanism that augments convolution adaptability through the self-conditioned generation of weights. The large kernels cooperate with non-linear embedding and gate mechanisms for better expressiveness and lighter pointwise interactions. Our investigation extends to refined training methods that unlock the full potential of these large kernels. Moreover, to promote more dynamic inter-channel interactions, we introduce an adaptive channel-wise bit allocation strategy that autonomously generates channel importance factors in a self-conditioned manner. To demonstrate the effectiveness of the proposed transform coding, we align the entropy model with those of existing transform methods for a fair comparison and obtain the models LLIC-STF, LLIC-ELIC, and LLIC-TCM. Extensive experiments demonstrate that the proposed LLIC models improve significantly over their corresponding baselines and reduce BD-Rate by 9.49%, 9.47%, and 10.94%, respectively, on Kodak relative to VTM-17.0 Intra. Our LLIC models achieve state-of-the-art performance and better trade-offs between performance and complexity.
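As a rough illustration of why large depth-wise kernels can keep complexity modest, the sketch below compares the weight count of a dense convolution with that of a depth-wise convolution followed by a 1×1 pointwise convolution. The channel width (192) and kernel sizes (3 and 11) are hypothetical values chosen for illustration and are not taken from the abstract; the block is a generic depth-wise separable layout, not the authors' exact transform.

```python
def dense_conv_params(channels: int, kernel: int) -> int:
    """Weights in a dense k x k convolution with equal in/out channels (bias ignored)."""
    return channels * channels * kernel * kernel


def depthwise_separable_params(channels: int, kernel: int) -> int:
    """Weights in a depth-wise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = channels * kernel * kernel  # one k x k filter per channel
    pointwise = channels * channels         # 1 x 1 mixing across channels
    return depthwise + pointwise


if __name__ == "__main__":
    channels = 192  # hypothetical channel width, not from the paper
    print("dense 3x3:            ", dense_conv_params(channels, 3))
    print("dense 11x11:          ", dense_conv_params(channels, 11))
    print("depth-wise 11x11 + 1x1:", depthwise_separable_params(channels, 11))
```

Under these assumed sizes, the depth-wise 11×11 layer with a pointwise convolution needs fewer weights than a single dense 3×3 layer, while covering a much larger spatial neighborhood per layer.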

Low-complexity Transform Network Architecture for JPEG AI Image Codec

EARN: Toward Efficient and Robust JPEG Compression Artifact Reduction

Bit Rate Matching Algorithm Optimization in JPEG-AI Verification Model

Aligned Intra Prediction and Hyper Scale Decoder Under Multistage Context Model for JPEG AI

Asymmetric Learned Image Compression with Multi-Scale Residual Block, Importance Scaling, and Post-Quantization Filtering

Low Complexity Depth Coding Assisted by Coding Information From Color Video

Optimized Decoupled Structure with Non-Local Attention for Deep Image Compression

High-Efficiency Lossy Image Coding Through Adaptive Neighborhood Information Aggregation

A Flexible and Configurable Architecture of Software and Hardware for JPEG Codec

Neural Network Assisted Lifting Steps For Improved Fully Scalable Lossy Image Compression in JPEG 2000

Efficient Learned Lossless JPEG Recompression

Bit Distribution Study and Implementation of Spatial Quality Map in the JPEG-AI Standardization

Joint Hierarchical Priors and Adaptive Spatial Resolution for Efficient Neural Image Compression

LLIC: Large Receptive Field Transform Coding with Adaptive Weights for Learned Image Compression

A Universal Optimization Framework for Learning-based Image Codec

On Efficient Neural Network Architectures for Image Compression

Lossless Recompression of JPEG Images Using Transform Domain Intra Prediction

An Energy Efficient JPEG Encoder with Neural Network Based Approximation and Near-Threshold Computing

Low-complexity Overfitted Neural Image Codec

Standard compliant video coding using low complexity, switchable neural wrappers

Rate-Distortion-Cognition Controllable Versatile Neural Image Compression