Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines

Honglei Zhang, Jukka I. Ahonen, Nam Le, Ruiying Yang, Francesco Cricri

2024-06-18

Abstract: This paper investigates the efficacy of jointly optimizing content-specific post-processing filters to adapt a human-oriented video/image codec into a codec suitable for machine vision tasks. Observing that the artifacts produced by video/image codecs are content-dependent, we propose a novel training strategy based on competitive learning principles. This strategy dynamically assigns training samples to filters in a fuzzy manner, which further optimizes the winning filter on the given sample. Inspired by simulated annealing optimization techniques, we employ a softmax function with a temperature variable as the weight allocation function to mitigate the effects of random initialization. Our evaluation, conducted on a system utilizing multiple post-processing filters within a Versatile Video Coding (VVC) codec framework, demonstrates the superiority of content-specific filters trained with our proposed strategies, specifically when images are processed in blocks. Using the VVC reference software VTM 12.0 as the anchor, experiments on the OpenImages dataset show an improvement in the BD-rate reduction from -41.3% and -44.6% to -42.3% and -44.7% for object detection and instance segmentation tasks, respectively, compared to independently trained filters. The statistics of the filter usage align with our hypothesis and underscore the importance of jointly optimizing filters for both content and reconstruction quality. Our findings pave the way for further improving the performance of video/image codecs.

Computer Vision and Pattern Recognition, Machine Learning, Multimedia

What problem does this paper attempt to address?

The paper addresses how to adapt a human-oriented video/image codec so that its output is better suited to machine vision tasks. The authors observe that the artifacts produced by conventional video/image codecs are content-dependent. To handle different kinds of content, they propose a training strategy based on competitive learning that jointly optimizes multiple content-specific post-processing filters. The method assigns training samples to filters dynamically and in a fuzzy manner, so that the filter that performs best on a sample is optimized further on that sample. The weight allocation function is a softmax with a temperature variable, an idea inspired by simulated annealing, which mitigates the effects of random initialization. With VTM 12.0 as the anchor, experiments on the OpenImages dataset show that the jointly trained filters outperform independently trained filters, specifically when images are processed in blocks: the BD-rate reduction improves from -41.3% to -42.3% for object detection and from -44.6% to -44.7% for instance segmentation. The filter usage statistics are consistent with the authors' hypothesis and underscore the importance of jointly optimizing filters for both content and reconstruction quality.
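The softmax weight allocation with a temperature variable can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each filter's score for a sample is the negative of its loss on that sample, and the function names (allocation_weights, weighted_loss, annealed_temperature) and the geometric cooling schedule are hypothetical choices made only for illustration. At a high temperature every filter receives a similar share of each sample, which reduces the influence of random initialization. As the temperature falls, the assignment approaches winner-take-all.

```python
import math


def allocation_weights(losses, temperature):
    """Softmax over negative per-filter losses (assumed score definition).

    A lower loss yields a larger weight. A high temperature flattens the
    weights; a low temperature concentrates them on the best filter.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [-loss / temperature for loss in losses]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_loss(losses, temperature):
    """Combine per-filter losses for one sample using the soft assignment."""
    weights = allocation_weights(losses, temperature)
    return sum(w * l for w, l in zip(weights, losses))


def annealed_temperature(step, total_steps, t_start=10.0, t_end=0.1):
    """Hypothetical geometric cooling schedule from t_start down to t_end."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return t_start * (t_end / t_start) ** frac


if __name__ == "__main__":
    # Illustrative losses of three filters on one training sample.
    sample_losses = [1.0, 2.0, 4.0]
    for step in (0, 50, 100):
        t = annealed_temperature(step, 100)
        print(step, t, allocation_weights(sample_losses, t))
```

Running the script prints the weights at the start, middle, and end of the schedule, showing how the share of the lowest-loss filter grows as the temperature decreases. In a real training loop the losses would come from the machine-task network evaluated on each filter's output, a detail this sketch leaves out.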

Learned Image Coding for Machines: A Content-Adaptive Approach

Idam: Iteratively Trained Deep In-Loop Filter with Adaptive Model Selection

NR-CNN: Nested-Residual Guided CNN In-loop Filtering for Video Coding

An Integrated CNN-based Post Processing Filter For Intra Frame in Versatile Video Coding

Suboptimal video coding for machines method based on selective activation of in‐loop filter

Convolutional Neural Network Based In-Loop Filter for VVC Intra Coding

In-Loop Filtering via Trained Look-Up Tables

Advanced Fine-Tuning Procedures to Enhance DNN Robustness in Visual Coding for Machines

Efficient Adaptation of Neural Network Filter for Video Compression

A Unified Active and Semi-Supervised Learning Framework for Image Compression

A Universal Optimization Framework for Learning-based Image Codec

Optimized Non-local In-Loop Filter for Video Coding

Towards Coding for Human and Machine Vision: Scalable Face Image Coding

Optimize Neural Network Based In-Loop Filters Through Iterative Training

A Learning-Based Low Complexity In-Loop Filter for Video Coding

Content-Aware Convolutional Neural Network for In-Loop Filtering in High Efficiency Video Coding

Joint Rate Distortion Optimization with CNN-based In-Loop Filter for Hybrid Video Coding

Rethinking the Joint Optimization in Video Coding for Machines: A Case Study

NN-VVC: Versatile Video Coding boosted by self-supervisedly learned image coding for machines

An Image Compression Framework with Learning-based Filter