Exploiting Inter-Image Similarity Prior for Low-Bitrate Remote Sensing Image Compression

Junhui Li, Xingsong Hou
2024-07-17
Abstract: Deep learning-based methods have garnered significant attention in remote sensing (RS) image compression due to their superior performance. Most of these methods focus on enhancing the coding capability of the compression network and improving entropy model prediction accuracy. However, they typically compress and decompress each image independently, ignoring the significant inter-image similarity prior. In this paper, we propose a codebook-based RS image compression (Code-RSIC) method with a generated discrete codebook, which is deployed at the decoding end of a compression algorithm to provide inter-image similarity prior. Specifically, we first pretrain a high-quality discrete codebook using the competitive generation model VQGAN. We then introduce a Transformer-based prediction model to align the latent features of the decoded images from an existing compression algorithm with the frozen high-quality codebook. Finally, we develop a hierarchical prior integration network (HPIN), which mainly consists of Transformer blocks and multi-head cross-attention modules (MCMs) that can query hierarchical prior from the codebook, thus enhancing the ability of the proposed method to decode texture-rich RS images. Extensive experimental results demonstrate that the proposed Code-RSIC significantly outperforms state-of-the-art traditional and learning-based image compression algorithms in terms of perception quality. The code will be available at https://github.com/mlkk518/Code-RSIC/
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
The problem this paper attempts to solve is: **how to use the inter-image similarity prior to improve the perceptual quality of low-bitrate remote sensing image compression**. Although existing deep-learning-based remote sensing image compression methods have improved coding capability and entropy model prediction accuracy, they usually compress and decompress each image independently, ignoring the inter-image similarity prior. As a result, decoded remote sensing images have poor perceptual quality at low bitrates, especially in texture-rich areas. To solve this problem, the authors propose a codebook-based remote sensing image compression method (Code-RSIC), which learns a discrete codebook to provide the inter-image similarity prior and uses this prior at the decoding end to enhance decoding performance. The specific steps are as follows:

1. **Codebook Learning (Stage I)**:
   - Use VQGAN to pretrain a high-quality discrete codebook.
   - Optimize the codebook by minimizing the reconstruction loss, the perceptual loss, and the adversarial loss.
2. **Transformer-based Codebook Lookup (Stage II)**:
   - Introduce a Transformer module to predict the code sequence from the decoded low-quality features.
   - Use the cross-entropy loss and the L2 loss to train and fine-tune the Transformer module and the encoder.
3. **Hierarchical Prior Integration Network (Stage III)**:
   - Build a hierarchical prior integration network (HPIN) containing Transformer blocks and multi-head cross-attention modules (MCMs) to query the hierarchical prior information in the codebook.
   - Improve the quality of the decoded image by fusing the intermediate features with the codebook prior information.

Through these steps, Code-RSIC can significantly improve the perceptual quality of remote sensing images at low bitrates, surpassing existing traditional and learning-based image compression algorithms.

### Summary of Key Formulas

- **Quantization Operation**:
  \[ F_c(u,v)=\arg\min_{c_n\in C}\|F_h(u,v)-c_n\| \]
  \[ S(u,v)=\arg\min_n\|F_h(u,v)-c_n\| \]
- **Loss Function**:
  - Codebook Learning Stage:
    \[ L_{s1}=L_{rec}+L_{per}+L_{cl}+\lambda_1L_{adv} \]
    where
    \[ L_{cl}=\|SG[F_h]-F_c\|^2+\alpha\|F_h-SG[F_c]\|^2 \]
    \[ \lambda_1=\frac{\|\nabla_{D_H}[L_{rec}]\|}{\|\nabla_{D_H}[L_{adv}]\|+\epsilon} \]
  - Codebook Lookup Stage:
    \[ L_{s2}=L_{qf}+\lambda_2L_{ce} \]
    where
    \[ L_{ce}=\sum_{n=0}^{N-1}-S_n\log(\hat{S}_n) \]
    \[ L_{qf}=\|F_l-SG[F_c]\|^2 \]
  - Hierarchical Prior Integration Stage:
    \[ L_{s3}=L_{s2}+L'_{rec}+L'_{per}+\lambda_3L'_{adv} \]
    where
    \[ \lambda_3=\frac{\|\nabla_{DP}[L'_{rec}]\|}{\|\nabla_{DP}[L'_{adv}]\|+\epsilon} \]
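
The quantization operation and the codebook loss \(L_{cl}\) can be illustrated with a small sketch. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the function names `quantize` and `codebook_loss`, the toy 2-D codebook, and the value `alpha=0.25` are hypothetical choices made for the example. SG[·] is read here as the stop-gradient operator, which only affects gradients, so the forward value of \(L_{cl}\) reduces to \((1+\alpha)\|F_h-F_c\|^2\).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features, codebook):
    """Nearest-neighbour lookup of each feature vector in the codebook.

    features: list of D-dimensional vectors, one per spatial position (u, v)
    codebook: list of N D-dimensional code items c_n
    Returns (quantized, indices), i.e. F_c and S in the formulas.
    """
    quantized, indices = [], []
    for f in features:
        # S(u, v): index of the closest code item
        n = min(range(len(codebook)),
                key=lambda i: squared_distance(f, codebook[i]))
        indices.append(n)
        # F_c(u, v): the closest code item itself
        quantized.append(list(codebook[n]))
    return quantized, indices


def codebook_loss(features, quantized, alpha=0.25):
    """Forward value of L_cl = ||SG[F_h] - F_c||^2 + alpha * ||F_h - SG[F_c]||^2.

    Stop-gradient does not change the forward value, so both terms share
    the same squared distance. alpha=0.25 is an assumed example value.
    """
    sq = sum(squared_distance(f, q) for f, q in zip(features, quantized))
    return (1.0 + alpha) * sq


# Toy example: three 2-D features and a codebook with three items.
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
features = [[0.1, -0.1], [0.9, 1.2], [1.8, 0.3]]

quantized, indices = quantize(features, codebook)
loss = codebook_loss(features, quantized, alpha=0.25)

print(indices)  # [0, 1, 2]
print(loss)     # approximately 0.25
```

In an actual training setup the two terms of \(L_{cl}\) would differ in which tensor receives gradients (codebook versus encoder), which an automatic-differentiation framework expresses with a detach or stop-gradient call; the sketch above only reproduces the forward computation.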