Optimized Decoupled Structure with Non-Local Attention for Deep Image Compression

Xuanye Zhang, Zhaobin Zhang, Yaojun Wu, Semih Esenlik, Xiaoyan Sun, Kai Zhang, Li Zhang
DOI: https://doi.org/10.1109/icip51287.2024.10648246
2024-01-01
Abstract: Recently, a decoupled framework for learning-based image compression has been proposed and adopted into the JPEG AI image coding standard developed by ISO/IEC WG1. The decoupled structure disentangles the sample reconstruction process from the entropy decoding process, making decoding extremely fast. The corresponding techniques constitute essential parts of the JPEG AI verification model software. However, its analysis and synthesis transforms are relatively simple: they are built from stacked convolution layers and may therefore lack the capability to capture data correlations. In this work, we enhance the transform networks by introducing the non-local attention mechanism, which has proven efficient in image compression tasks. The proposed framework thus combines the fast decoding of the decoupled architecture with the strong transform capability of non-local attention, making it a stronger candidate for practical end-to-end image codec deployment. Experimental results on the Kodak test set and the JPEG AI CfP test set show that our method achieves better BD-rate performance than the original Decoupled-anchor and significantly faster decoding than NIC. The proposed solution was adopted by the IEEE 1857.11 Working Subgroup (1857.11 WSG) at its 10th meeting for the development of neural network-based image coding standards.
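For readers unfamiliar with the non-local attention mechanism the abstract refers to, the following is a minimal NumPy sketch of a generic non-local (embedded-Gaussian) block with a residual connection. It is an illustration only, not the transform network used in this work: the function name `non_local_block`, the weight names, and the tensor sizes are all assumptions chosen for the example, and the 1x1 convolutions are written as plain matrix products.

```python
import numpy as np

def softmax(a, axis=-1):
    # Numerically stable softmax along the given axis.
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def non_local_block(x, w_theta, w_phi, w_g, w_out):
    """Generic non-local block (hypothetical helper, not the paper's module).

    x:                    (C, H, W) feature map
    w_theta, w_phi, w_g:  (C_mid, C) weights standing in for 1x1 convolutions
    w_out:                (C, C_mid) output projection
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)               # (C, N) with N = H * W positions
    theta = w_theta @ flat                   # (C_mid, N) queries
    phi = w_phi @ flat                       # (C_mid, N) keys
    g = w_g @ flat                           # (C_mid, N) values
    attn = softmax(theta.T @ phi, axis=-1)   # (N, N); each row sums to 1
    y = g @ attn.T                           # (C_mid, N); every position mixes all others
    out = w_out @ y                          # (C, N)
    return x + out.reshape(c, h, w)          # residual connection

# Illustrative sizes only (assumed, not taken from the paper).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
w_theta, w_phi, w_g = (0.1 * rng.standard_normal((4, 8)) for _ in range(3))
w_out = 0.1 * rng.standard_normal((8, 4))
z = non_local_block(x, w_theta, w_phi, w_g, w_out)
print(z.shape)
```

Unlike a stacked convolution, whose receptive field grows only with depth, each output position here is a weighted sum over every spatial position, which is the property the abstract appeals to when it speaks of capturing data correlations.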