Abstract: In recent years, numerous learned video compression (LVC) methods have emerged, showing rapid development and satisfactory performance. However, most previous methods use only the single previous frame as a reference. Although some works use multiple previous frames, they do not exploit temporal information comprehensively. Our proposed method not only utilizes short-term information from multiple neighboring frames but also introduces long-term feature information as a reference, which enhances the quality of the context and improves compression efficiency. Specifically, we propose a long-term information exploitation mechanism to capture long-term temporal relevance. The update and propagation of long-term information establish an implicit connection between the latent representation of the current frame and distant reference frames, aiding the generation of long-term context. Meanwhile, short-term neighboring frames are used to extract local information and generate short-term context. Fusing the long-term and short-term contexts yields a more comprehensive, higher-quality context and thus more thorough mining of temporal information. In addition, the multi-frame information helps improve the efficiency of motion compression: it is used to generate the predicted motion and to remove spatio-temporal redundancies in the motion information through second-order motion prediction and fusion. Experimental results demonstrate that our proposed efficient learned video compression scheme achieves a 4.79% BD-rate saving compared with H.266 (VTM).
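To make the idea of predicting motion from multiple previous frames more concrete, the following is a minimal sketch and not the network described in the abstract. It assumes a hand-written linear extrapolation from the two previous motion fields in place of the learned prediction and fusion modules. The function names `predict_motion` and `motion_residual` and the toy motion values are hypothetical and chosen only for illustration.

```python
import numpy as np


def predict_motion(mv_prev1, mv_prev2):
    """Extrapolate the current motion field from the two previous ones.

    Assumption: motion changes roughly linearly over time, so the change
    observed between the two previous fields is applied once more.
    """
    return mv_prev1 + (mv_prev1 - mv_prev2)


def motion_residual(mv_cur, mv_prev1, mv_prev2):
    """Return the part of the current motion that the prediction misses."""
    return mv_cur - predict_motion(mv_prev1, mv_prev2)


# Toy motion fields with shape (2, H, W): horizontal and vertical components.
h, w = 4, 4
mv_t2 = np.full((2, h, w), 1.0)  # motion at time t-2
mv_t1 = np.full((2, h, w), 2.0)  # motion at time t-1
mv_t = np.full((2, h, w), 3.1)   # motion at time t

res = motion_residual(mv_t, mv_t1, mv_t2)

# The residual has a much smaller magnitude than the raw motion field,
# so it would typically be cheaper to encode.
print(np.abs(mv_t).mean(), np.abs(res).mean())
```

In this toy case the codec would transmit only the small residual, and the decoder would rebuild the motion by adding it to the same prediction computed from already decoded motion fields. A learned scheme replaces the fixed extrapolation with trained networks, but the redundancy being removed is of the same kind.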

Learning-Based Video Compression with Continuously Variable Bitrate Coding

Learned Video Compression with Adaptive Temporal Prior and Decoded Motion-aided Quality Enhancement

Modulated Variable-Rate Deep Video Compression

A Deeply Modulated Scheme for Variable-Rate Video Compression

Optimized Bit Allocation for Learning-based Video Compression

Efficient Learned Video Compression via Bidirectional Temporal Information Exploration

PFR-VC: Learning-Based Video Compression Framework with Predicted Frame Refinement

A Learning-Based Framework for Low Bit-Rate Image and Video Coding

Long-Term and Short-Term Information Propagation and Fusion for Learned Video Compression

M-LVC: Multiple Frames Prediction for Learned Video Compression

Learning-Based End-to-End Video Compression with Spatial-Temporal Adaptation

Perceptual Friendly Variable Rate Image Compression

SigVIC: Spatial Importance Guided Variable-Rate Image Compression

Multi-rate Adaptive Transform Coding for Video Compression

Content Adaptive and Error Propagation Aware Deep Video Compression

Versatile Learned Video Compression

An End-to-End Learning Framework for Video Compression

Content-adaptive Variable Resolution Framework for Intra Coding

Learning for Video Compression

Variable Bitrate Image Compression with Quality Scaling Factors

Neural Rate Control for Learned Video Compression