MPAI-EEV: Standardization Efforts of Artificial Intelligence based End-to-End Video Coding

Chuanmin Jia, Feng Ye, Fanke Dong, Kai Lin, Leonardo Chiariglione, Siwei Ma, Huifang Sun, Wen Gao

2023-09-14

Abstract: The rapid advancement of artificial intelligence (AI) technology has made the standardization of neural-network-based video processing, coding, and transmission a priority. To address this priority area, the Moving Picture, Audio, and Data Coding by Artificial Intelligence (MPAI) group is developing a suite of standards called MPAI-EEV for "end-to-end optimized neural video coding." The aim of this AI-based video standard project is to reduce the number of bits required to represent high-fidelity video data by using data-trained neural coding technologies. This approach is not constrained by how data coding has traditionally been applied in the context of a hybrid framework. This paper presents an overview of recent and ongoing standardization efforts in this area and highlights the key technologies and design philosophy of EEV. It also compares and reports on some primary efforts, such as the coding efficiency of the reference model. Additionally, it discusses emerging activities, such as learned unmanned aerial vehicle (UAV) video coding, which are currently planned, under development, or in the exploration phase. With a focus on UAV video signals, this paper addresses the current status of these preliminary efforts. It also indicates development timelines, summarizes the main technical details, and provides pointers to further references. The exploration experiment shows that the EEV model performs better than the state-of-the-art video coding standard H.266/VVC in terms of a perceptual evaluation metric.

Multimedia, Image and Video Processing

What problem does this paper attempt to address?

This paper addresses the standardization of video processing, coding, and transmission based on neural networks, with the goal of improving video compression efficiency. Specifically, it explores how end-to-end optimized neural video coding (EEV) can reduce the number of bits required to represent high-fidelity video data without being constrained by the traditional hybrid coding framework. The paper reports that the MPAI (Moving Picture, Audio, and Data Coding by Artificial Intelligence) organization is developing a series of standards called MPAI-EEV, which pursue this goal using data-trained neural coding technology. The paper also pays special attention to unmanned aerial vehicle (UAV) video coding, because such videos have specific motion characteristics and lens distortion properties that traditional coding methods struggle to handle effectively. By introducing learning-based coding tools such as prediction refinement networks, coarse-to-fine residual modeling, and in-loop restoration networks, the paper shows that the EEV model outperforms the H.266/VVC video coding standard on a perceptual evaluation metric. In summary, the paper aims to advance AI-driven end-to-end video coding standards to meet the demands of future video formats and content.

Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics

Neural Video Coding Using Multiscale Motion Compensation and Spatiotemporal Context Model

MPEG Internet Video Coding Standard and Its Performance Evaluation

Designs and Implementations in Neural Network-based Video Coding

Recent Standard Development Activities on Video Coding for Machines

Deep Video Compression with Scaled Hierarchical Bi-directional Motion Model

Overview of MPEG Internet Video Coding

Advances in Video Compression System Using Deep Neural Network: A Review and Case Studies

Towards Next Generation Video Coding: from Neural Network Based Predictive Coding to In-Loop Filtering

NN-VVC: Versatile Video Coding boosted by self-supervisedly learned image coding for machines

A Neural-network Enhanced Video Coding Framework beyond ECM

An Emerging Coding Paradigm VCM: A Scalable Coding Approach Beyond Feature and Signal

Coding Tools Investigation for Next Generation Video Coding Based on HEVC

Video Coding for Machines: Compact Visual Representation Compression for Intelligent Collaborative Analytics

Image and Video Compression with Neural Networks: A Review

Overview of the Versatile Video Coding (VVC) Standard and its Applications

Deep Video Precoding

Overview of Intelligent Video Coding: from Model-Based to Learning-Based Approaches

PEA265: Perceptual Assessment of Video Compression Artifacts

Towards Coding for Human and Machine Vision: Scalable Face Image Coding