Learned Image Compression Using A Long and Short Attention Module

Zenghui Duan, Cheolkon Jung, Yang Liu, Ming Li
DOI: https://doi.org/10.1109/icip51287.2024.10647655
2024-01-01
Abstract: Latent representations based on hyper-prior auto-encoders have recently been applied to end-to-end image compression, showing performance comparable to the latest Versatile Video Coding (VVC) intra coding. The rate-distortion efficiency of image compression is greatly affected by the latent representation extracted by the auto-encoder. In this paper, we propose learned image compression using a long and short attention (LSA) module. We introduce the LSA module into an auto-encoder to obtain an accurate latent representation of the image. The LSA module improves the auto-encoder's ability to extract global and local image features, thereby saving bitrate and achieving higher compression efficiency. We add two LSA modules in the encoding stage and the decoding stage to improve the encoding and decoding capabilities of the learned image compression network. Experiments on the JPEG AI dataset show that the LSA module successfully reconstructs image details, and thus the proposed method achieves state-of-the-art performance in terms of multi-scale structural similarity (MS-SSIM). Moreover, the proposed method outperforms state-of-the-art methods at low bitrates in terms of peak signal-to-noise ratio (PSNR).
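For readers unfamiliar with the PSNR metric named in the abstract, the following is a minimal sketch of its standard definition, PSNR = 10 log10(MAX^2 / MSE). It is not taken from the paper: the function name `psnr`, the flat-list input format, and the 8-bit peak value of 255 are assumptions made for illustration only.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Return PSNR in dB between two equal-length flat sequences of pixel values.

    Illustrative helper (hypothetical name); max_value=255.0 assumes 8-bit images.
    """
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel positions.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value indicates that the reconstruction is closer to the original; comparisons between codecs are normally made at matched bitrates.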