Residual based hierarchical feature compression for multi-task machine vision

Chaoran Chen, Mai Xu, Shengxi Li, Tie Liu, Minglang Qiao, Zhuoyi Lv
DOI: https://doi.org/10.1109/ICME55011.2023.00253
2023-01-01
Abstract: With the remarkable success of deep learning, image/video coding for machines (VCM) has been playing an important role in facilitating intelligent vision tasks. However, existing VCM methods suffer either from the sub-optimality of image compression standards or from the generalisation issues of learning-based methods. To address these issues, this paper proposes a residual-based hierarchical feature compression (RHFC) method to achieve optimal and universal feature compression for object detection and segmentation. More specifically, we first analyse the redundancy that exists in features at multiple scales, finding that large-scale features are surprisingly less important to the vision tasks. Thus, we propose a pair of compression and enhancement networks to extract the very basic cues from the large-scale features, which are then compressed by the VVC codec. To compensate for the inevitable loss of detail, we further propose a hierarchical framework that compresses the residuals between the reconstructed and original features, so that performance can be significantly improved at a low bit-rate cost. Experimental results verify the superior performance of our method against both state-of-the-art learning-based and standard feature compression methods. Our RHFC method also generalises well to other scenarios without the need for any further fine-tuning.
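The residual idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a uniform scalar quantizer stands in for the VVC codec, the compression and enhancement networks are omitted, and the names `quantize` and `encode_decode_two_layer` and the step sizes are hypothetical. It only shows how coding the residual between the original and the base reconstruction recovers detail lost in a coarse base layer.

```python
import numpy as np


def quantize(x, step):
    """Uniform scalar quantizer; an assumed stand-in for a lossy codec such as VVC."""
    return np.round(x / step) * step


def encode_decode_two_layer(feature, base_step, residual_step):
    """Code a feature map as a coarse base layer plus a finer residual layer."""
    base_rec = quantize(feature, base_step)            # coarse base reconstruction
    residual = feature - base_rec                      # detail lost by the base layer
    residual_rec = quantize(residual, residual_step)   # residual coded at finer precision
    return base_rec, base_rec + residual_rec


# Synthetic feature map (channels x height x width); real features would come from a backbone.
rng = np.random.default_rng(0)
feature = rng.standard_normal((4, 8, 8))

base_rec, final_rec = encode_decode_two_layer(feature, base_step=1.0, residual_step=0.25)

base_err = np.mean((feature - base_rec) ** 2)    # distortion with the base layer only
final_err = np.mean((feature - final_rec) ** 2)  # distortion after adding the residual layer
print(f"base-only MSE: {base_err:.4f}, base+residual MSE: {final_err:.4f}")
```

In this toy setting the residual layer can only reduce the per-element error, because zero is always an available quantization level for the residual. The actual rate and task accuracy trade-off depends on the codec and networks described in the paper.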