BSSIC: Stereo Image Compression Based on Block Shift

Ya Qiao, Yongqi Zhai, Ronggang Wang
DOI: https://doi.org/10.1109/ijcnn60899.2024.10650674
2024-01-01
Abstract: We present a novel end-to-end stereo image compression method that capitalizes on the inherent inter-view similarity observed in stereo image pairs. Our primary strategy is block shift. The left image is compressed by a single-image compression method, and the right image is processed afterward. The latent representations of both images are then partitioned into blocks. Because existing methods provide only relatively coarse estimates of the global shift, we conduct an iterative search to identify the optimal block shift for corresponding blocks, aiming to accurately capture the overlapping fields of view. In addition, we introduce a more powerful and efficient CNN backbone, ConvNeXts, in the encoders and decoders, ensuring compact latent representations. We also employ stereo attention modules during both the encoding and decoding stages to further exploit cross-view information. Experimental results demonstrate that the proposed BSSIC outperforms state-of-the-art methods on the Cityscapes dataset while being lightweight and significantly faster during decoding.
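The block-shift idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the search procedure or the matching criterion, so the sketch assumes an exhaustive search over non-negative horizontal shifts, a mean-squared-error criterion, and a single-channel latent stored as a 2D list. The names `block_mse`, `best_block_shifts`, `block`, and `max_shift` are hypothetical.

```python
import random


def block_mse(left, right, y, x, d, block):
    """MSE between the right block at (y, x) and the left block d columns to its right."""
    total = 0.0
    for r in range(y, y + block):
        for c in range(x, x + block):
            diff = left[r][c + d] - right[r][c]
            total += diff * diff
    return total / (block * block)


def best_block_shifts(left, right, block=4, max_shift=8):
    """For every block of `right`, return the horizontal shift into `left` with the lowest MSE.

    Assumption: height and width are divisible by `block`, and a point at column x
    in the right view appears at column x + d (d >= 0) in the left view.
    """
    h, w = len(right), len(right[0])
    shifts = []
    for y in range(0, h, block):
        row = []
        for x in range(0, w, block):
            best_d, best_err = 0, float("inf")
            for d in range(max_shift + 1):
                if x + d + block > w:  # shifted block would leave the latent
                    break
                err = block_mse(left, right, y, x, d, block)
                if err < best_err:
                    best_d, best_err = d, err
            row.append(best_d)
        shifts.append(row)
    return shifts


# Toy data: the right latent is the left latent moved 3 columns to the left.
random.seed(0)
left = [[random.random() for _ in range(32)] for _ in range(8)]
right = [row[3:] + row[:3] for row in left]
shifts = best_block_shifts(left, right)
```

In this toy setup every block except the last one in each row recovers a shift of 3; the last block has no room to shift, so it falls back to 0. A per-block shift of this kind is finer-grained than a single global shift, which is the limitation of earlier methods that the abstract points to.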