Analysis on Compressed Domain: A Multi-Task Learning Approach

Yuefeng Zhang, Chuanmin Jia, Jianhui Chang, Siwei Ma
DOI: https://doi.org/10.1109/dcc52660.2022.00105
2022-01-01
Abstract: Image compression approaches based on deep learning have achieved remarkable success. Existing studies mainly focus on human vision and on machine analysis tasks that take reconstructed images as input. However, those methods require images to be decoded before downstream visual tasks can be performed, which motivates us to explore how to conduct visual analysis directly on the compressed data without decoding. An overview of our proposed model is shown in Fig. 1(a). Specifically, a task-agnostic learning-based compression model is proposed, which effectively supports various compressed domain-based analytical tasks while preserving outstanding reconstructed perceptual quality compared with traditional and learning-based codecs. To obtain an extremely compact data representation that retains essential semantic information, we employ a generative model on the decoder side. Then, we propose a multi-task learning model that obtains semantic information directly from the compressed visual data. The pipeline of the proposed model is illustrated in detail in Fig. 1(b). In addition, a joint optimization strategy is adopted to balance compression efficiency, reconstructed image quality, and downstream visual task performance. Experimental results on the CelebA-HQ dataset verify that our compressed domain-based multi-task analysis model outperforms the reconstructed image-based method in transmission efficiency, reducing bit-rate consumption by more than a factor of ten while preserving comparable visual analysis precision (on classification and segmentation tasks) relative to models that take RGB images as input.
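The joint optimization mentioned in the abstract can be pictured as a weighted sum of a rate term, a reconstruction term, and the per-task losses. The abstract does not give the actual objective, so the following is only a minimal sketch under that assumption; the names `LossWeights` and `joint_loss` and the weighted-sum form itself are hypothetical illustrations, not the authors' formulation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical trade-off weights; the paper's actual values are not given."""
    rate: float = 1.0
    distortion: float = 1.0
    task: float = 1.0


def joint_loss(
    rate_bpp: float,
    distortion: float,
    task_losses: Sequence[float],
    weights: LossWeights = LossWeights(),
) -> float:
    """Combine the three objectives into one scalar (assumed weighted-sum form).

    rate_bpp    -- estimated bit-rate of the compressed representation
    distortion  -- reconstruction loss of the decoded image
    task_losses -- one loss per downstream task (e.g. classification, segmentation)
    """
    return (
        weights.rate * rate_bpp
        + weights.distortion * distortion
        + weights.task * sum(task_losses)
    )


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the paper.
    example = joint_loss(0.5, 0.25, [0.5, 0.25], LossWeights(rate=2.0, distortion=4.0))
    print(example)
```

Under this assumed form, raising `weights.rate` favors smaller bitstreams at the expense of reconstruction quality and task accuracy, which is the kind of balance the abstract refers to.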