Image Coding for Machines with Omnipotent Feature Learning

Ruoyu Feng, Xin Jin, Zongyu Guo, Runsen Feng, Yixin Gao, Tianyu He, Zhizheng Zhang, Simeng Sun, Zhibo Chen
DOI: https://doi.org/10.48550/arXiv.2207.01932
2022-07-07
Abstract: Image Coding for Machines (ICM) aims to compress images for AI tasks analysis rather than meeting human perception. Learning a kind of feature that is both general (for AI tasks) and compact (for compression) is pivotal for its success. In this paper, we attempt to develop an ICM framework by learning universal features while also considering compression. We name such features as omnipotent features and the corresponding framework as Omni-ICM. Considering self-supervised learning (SSL) improves feature generalization, we integrate it with the compression task into the Omni-ICM framework to learn omnipotent features. However, it is non-trivial to coordinate semantics modeling in SSL and redundancy removing in compression, so we design a novel information filtering (IF) module between them by co-optimization of instance distinguishment and entropy minimization to adaptively drop information that is weakly related to AI tasks (e.g., some texture redundancy). Different from previous task-specific solutions, Omni-ICM could directly support AI tasks analysis based on the learned omnipotent features without joint training or extra transformation. Albeit simple and intuitive, Omni-ICM significantly outperforms existing traditional and learning-based codecs on multiple fundamental vision tasks.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to compress images efficiently for machine analysis while preserving the semantic information that machine tasks depend on. Existing image compression methods are mainly optimized for human-perceived quality, for example by minimizing mean squared error (MSE) or maximizing multi-scale structural similarity (MS-SSIM), and they perform poorly when supporting high-level machine vision tasks such as object detection and instance segmentation. The reason is a mismatch between the metrics used for human-perceived quality and those used for machine-task performance (such as classification accuracy). To close this gap, the paper proposes a framework named Omni-ICM, which aims to learn features that are general enough to serve different intelligent tasks and compact enough to compress well, the so-called "omnipotent features". Specifically, the framework integrates self-supervised learning (SSL) with the compression task and places an information filtering (IF) module between them. The IF module adaptively discards information that is only weakly related to AI tasks, such as some texture redundancy, which encourages the learned representations to be sparser and more compact. This improves compression efficiency while keeping the compressed features usable for multiple downstream tasks without joint training or extra transformation. In summary, the core problem is to develop an image coding framework that compresses images for machine analysis, retains key semantic information, and overcomes the limitations of existing codecs on machine tasks.
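The co-optimization of instance distinguishment and entropy minimization described above can be illustrated with a toy objective. The following is a minimal sketch, not the paper's implementation: it assumes an InfoNCE-style contrastive term as the instance-distinguishment loss and the empirical entropy of rounded features as a stand-in for a learned entropy model. The function names `info_nce`, `rate_bits`, and `joint_loss`, the `temperature` value, and the weight `lam` are all hypothetical.

```python
import math
from collections import Counter


def info_nce(anchor, positive, negatives, temperature=0.2):
    """InfoNCE loss for one anchor (assumed contrastive term, not the paper's exact loss).

    All vectors must be non-zero lists of floats with equal length.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    # Logit 0 belongs to the positive pair; the rest belong to negatives.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]

    # Numerically stable log-sum-exp, then cross-entropy against index 0.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_sum - logits[0]


def rate_bits(symbols):
    """Total bits implied by the empirical entropy of quantized symbols.

    A real learned codec would estimate this with a trained entropy model;
    the empirical histogram here is only an illustrative proxy.
    """
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c * math.log2(c / total) for c in counts.values())


def joint_loss(anchor, positive, negatives, lam=0.01):
    """Contrastive term plus a weighted rate term (hypothetical weighting)."""
    quantized = [round(x) for x in anchor]  # hard rounding as toy quantization
    return info_nce(anchor, positive, negatives) + lam * rate_bits(quantized)
```

In this sketch, a feature vector whose rounded values are mostly identical yields a small rate term, which mirrors the idea that dropping weakly task-related information makes the representation cheaper to code, while the contrastive term keeps features of different instances distinguishable.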