End-to-End Learned Scalable Multilayer Feature Compression for Machine Vision Tasks

Qiaoxi Chen, Changsheng Gao, Dong Liu
DOI: https://doi.org/10.1109/dcc58796.2024.00067
2024-01-01
Abstract: In Video Coding for Machines (VCM), scalable feature compression has attracted attention for its potential to support a variety of machine vision tasks. However, existing scalable feature compression methods exhibit limited performance. To address this problem, we propose an end-to-end learned scalable multilayer feature compression method. First, we leverage an end-to-end learned feature compression method, which efficiently exploits redundancy among features through learning, to improve compression efficiency. Second, we introduce a novel strategy that uses the transformed latent of the base layer as conditional information for the enhancement layer. Because our compression method is learnable, we optimize the base layer and the enhancement layer jointly; this joint optimization encourages the base layer to produce more suitable conditional information for the enhancement layer. Comparative experiments against existing feature compression and image compression methods verify that our approach achieves remarkable performance improvements.
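The two ideas in the abstract, conditioning the enhancement layer on the base-layer latent and scoring both layers with a single joint objective, can be illustrated with a toy sketch. This is not the authors' network: the learned transforms and entropy model are replaced here by uniform scalar quantization, a scaled prediction, and a crude bit-cost proxy, and every name (`encode_scalable`, `joint_loss`, `rate_proxy`, `weight`, `lam`) is a hypothetical chosen for illustration.

```python
import math


def quantize(values, step):
    """Uniform scalar quantization: map each value to an integer index."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Inverse of quantize: map integer indices back to reconstructed values."""
    return [i * step for i in indices]


def rate_proxy(indices):
    """Crude bit-cost proxy (assumption): larger magnitudes cost more bits."""
    return sum(math.log2(1 + abs(i)) + 1 for i in indices)


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def encode_scalable(base_feat, enh_feat, step=0.5, weight=1.0):
    # Base layer: quantize the base feature into a latent.
    base_latent = quantize(base_feat, step)
    base_rec = dequantize(base_latent, step)
    # Enhancement layer: the decoded base latent acts as the condition.
    # Here it is used as a scaled prediction (a stand-in for a learned
    # conditional model), so only the residual has to be coded.
    prediction = [weight * b for b in base_rec]
    residual = [e - p for e, p in zip(enh_feat, prediction)]
    enh_latent = quantize(residual, step)
    enh_rec = [p + r for p, r in zip(prediction, dequantize(enh_latent, step))]
    return base_latent, enh_latent, base_rec, enh_rec


def joint_loss(base_feat, enh_feat, lam=10.0, step=0.5, weight=1.0):
    """Single rate-distortion objective covering both layers (assumed form)."""
    base_latent, enh_latent, base_rec, enh_rec = encode_scalable(
        base_feat, enh_feat, step=step, weight=weight
    )
    rate = rate_proxy(base_latent) + rate_proxy(enh_latent)
    distortion = mse(base_feat, base_rec) + mse(enh_feat, enh_rec)
    return rate + lam * distortion


# Two correlated feature layers (made-up numbers for illustration).
base_feat = [1.0, 2.0, -3.0, 0.5]
enh_feat = [1.1, 2.1, -2.9, 0.4]
base_latent, enh_latent, base_rec, enh_rec = encode_scalable(base_feat, enh_feat)
conditional_rate = rate_proxy(enh_latent)
independent_rate = rate_proxy(quantize(enh_feat, 0.5))
```

When the two layers are correlated, as in this example, the conditionally coded enhancement latent costs fewer bits under the proxy than coding the enhancement feature on its own. Because `joint_loss` sums the rate and distortion of both layers, any parameter shared by them (here only `weight` and `step`) would be tuned for the pair rather than for the base layer alone, which is the intuition behind joint optimization.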