MambaCSR: Dual-Interleaved Scanning for Compressed Image Super-Resolution With SSMs

Yulin Ren, Xin Li, Mengxi Guo, Bingchen Li, Shijie Zhao, Zhibo Chen
2024-08-22
Abstract: We present MambaCSR, a simple but effective framework based on Mamba for the challenging compressed image super-resolution (CSR) task. In particular, the scanning strategy of Mamba is crucial for effective contextual knowledge modeling in the restoration process, even though Mamba relies on selective state space modeling for all tokens. In this work, we propose an efficient dual-interleaved scanning paradigm (DIS) for CSR, which is composed of two scanning strategies: (i) hierarchical interleaved scanning is designed to comprehensively capture and utilize the most useful contextual information within an image by simultaneously taking advantage of local window-based and sequential scanning methods; (ii) horizontal-to-vertical interleaved scanning is proposed to reduce the computational cost by removing the redundancy between scans in different directions. To handle non-uniform compression artifacts, we also propose position-aligned cross-scale scanning to model multi-scale contextual information. Experimental results on multiple benchmarks demonstrate the strong performance of MambaCSR on the compressed image super-resolution task. The code will soon be available at https://github.com/renyulin-f/MambaCSR.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper addresses the challenges of the Compressed Image Super-Resolution (CSR) task. Unlike traditional Single Image Super-Resolution (SISR), CSR must handle severe mixed distortions caused by both compression and low resolution. These distortions include blocking artifacts, ringing artifacts, and color shifts, and they are accompanied by the loss of critical information. As a result, CSR places higher demands on the contextual information modeling capabilities of existing super-resolution models.

### Main Contributions

1. **Proposed MambaCSR Framework**: This is the first CSR method based on the Mamba framework. Its Dual-Interleaved Scanning (DIS) strategy lets Mamba model contextual information more comprehensively and efficiently in the CSR task.
2. **Designed Dual-Interleaved Scanning Strategy**:
   - **Hierarchical Interleaved Scanning**: Integrates local and long-range contextual information.
   - **Horizontal-to-Vertical Interleaved Scanning**: Reduces redundancy between different scanning directions, lowering computational cost.
3. **Introduced Position-Aligned Cross-Scale Scanning Strategy**: Integrates multi-scale contextual information, improving the ability to handle non-uniform distortions.
4. **Experimental Validation**: Results on multiple benchmark datasets show that MambaCSR performs strongly on the compressed image super-resolution task.

### Background and Motivation

- **Limitations of Existing Methods**: Existing super-resolution models struggle to model long-range contextual information, especially when dealing with the complex distortions in compressed images.
- **Advantages of the Mamba Framework**: Mamba uses a selective State Space Model (SSM), which dynamically decides how much learned knowledge each token retains along the scanning trajectory and can therefore model long-range contextual information effectively.
- **Special Needs of the CSR Task**: The distortions in CSR are diverse and non-uniform, so the model must extract the most relevant information from all tokens in the image. Both local dependencies and long-range contextual information are therefore crucial.

### Method Overview

1. **Dual-Interleaved Scanning (DIS)**:
   - **Hierarchical Interleaved Scanning**: Alternates between window scanning and sequential scanning to extract both local and long-range contextual information.
   - **Horizontal-to-Vertical Interleaved Scanning**: Reduces redundancy among the four scanning directions, improving computational efficiency.
2. **Position-Aligned Cross-Scale Scanning**: Scans tokens at the same position across different scales together, integrating multi-scale contextual information and improving the handling of non-uniform distortions.

### Experimental Results

- **Quantitative Results**: On multiple compressed benchmark datasets, MambaCSR outperforms existing state-of-the-art (SOTA) methods in terms of PSNR and SSIM.
- **Qualitative Results**: Visual comparisons show that MambaCSR handles compression distortions well and restores textures and details.
- **Computational Efficiency**: The dual-interleaved scanning method significantly reduces computational complexity while maintaining performance.

### Conclusion

By introducing dual-interleaved scanning and position-aligned cross-scale scanning, MambaCSR addresses the challenges of compressed image super-resolution and demonstrates strong performance and computational efficiency in handling complex distortions.
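
### Illustrative Sketch of the Scanning Orders

The dual-interleaved scanning described in the Method Overview can be pictured as a choice of token visiting order for each layer. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`sequential_scan`, `window_scan`, `layer_scan_orders`), the window size, and the exact layer schedule are all hypothetical, since the summary only states that window and sequential scanning alternate and that horizontal and vertical directions are interleaved rather than all four directions being scanned at once.

```python
from typing import List, Tuple


def sequential_scan(h: int, w: int, vertical: bool = False) -> List[int]:
    """Flat token ids (row-major numbering) in raster visiting order.

    vertical=False walks row by row; vertical=True walks column by column.
    """
    if vertical:
        return [r * w + c for c in range(w) for r in range(h)]
    return [r * w + c for r in range(h) for c in range(w)]


def window_scan(h: int, w: int, win: int, vertical: bool = False) -> List[int]:
    """Visit tokens one win x win window at a time (local context first).

    Windows are visited row by row; tokens inside each window follow the
    direction selected by `vertical`. Requires h and w divisible by win.
    """
    if h % win or w % win:
        raise ValueError("h and w must be divisible by win")
    local = sequential_scan(win, win, vertical)
    order = []
    for top in range(0, h, win):
        for left in range(0, w, win):
            for idx in local:
                r, c = divmod(idx, win)
                order.append((top + r) * w + (left + c))
    return order


def layer_scan_orders(
    num_layers: int, h: int, w: int, win: int
) -> List[Tuple[List[int], List[int]]]:
    """Assumed schedule: one (forward, backward) pair of orders per layer.

    Window and sequential scanning alternate every layer, and the direction
    switches between horizontal and vertical every two layers, so each layer
    scans two directions instead of four. This schedule is a guess made for
    illustration only.
    """
    orders = []
    for i in range(num_layers):
        vertical = (i // 2) % 2 == 1
        if i % 2 == 0:
            forward = window_scan(h, w, win, vertical)
        else:
            forward = sequential_scan(h, w, vertical)
        orders.append((forward, forward[::-1]))
    return orders
```

For a 4 x 4 token grid with 2 x 2 windows, `window_scan(4, 4, 2)` begins with tokens `[0, 1, 4, 5]` (the top-left window), whereas `sequential_scan(4, 4)` begins with `[0, 1, 2, 3]` (the first row). In a Mamba-style block, the feature tokens would be reordered by such an index list before the selective scan and restored to their original positions afterwards.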