Abstract: An important development direction in the Single-Image Super-Resolution (SISR) algorithms is to improve the efficiency of the algorithms. Recently, efficient Super-Resolution (SR) research focuses on reducing model complexity and improving efficiency through improved deep small kernel convolution, leading to a small receptive field. The large receptive field obtained by large kernel convolution can significantly improve image quality, but the computational cost is too high. To improve the reconstruction details of efficient super-resolution reconstruction, we propose a Symmetric Visual Attention Network (SVAN) by applying large receptive fields. The SVAN decomposes a large kernel convolution into three different combinations of convolution operations and combines them with an attention mechanism to form a Symmetric Large Kernel Attention Block (SLKAB), which forms a symmetric attention block with a bottleneck structure by the size of the receptive field in the convolution combination to extract depth features effectively as the basic component of the SVAN. Our network gets a large receptive field while minimizing the number of parameters and improving the perceptual ability of the model. The experimental results show that the proposed SVAN can obtain high-quality super-resolution reconstruction results using only about 30% of the parameters of existing SOTA methods.
### Problems the Paper Aims to Solve
This paper addresses the efficiency of Single-Image Super-Resolution (SISR) algorithms. Specifically, the authors seek to reduce model complexity and computational cost while maintaining or improving reconstruction quality, in particular the recovery of fine detail.
### Background and Motivation
With the rapid development of computer vision technology, the demand for high-resolution images is increasing. SISR restores a high-resolution image from a degraded low-resolution one, providing better visual fidelity and richer image detail. However, existing SISR models typically rely on complex deep network structures with high computational cost and a large number of parameters, which limits their practical application and deployment.
### Main Contributions
1. **Efficient Large Kernel Attention Mechanism**:
- A Symmetric Visual Attention Network (SVAN) is proposed, which constructs a lightweight and efficient Symmetric Large Kernel Attention Block (SLKAB) by combining three convolution operations (5×5 depthwise convolution, 5×5 depthwise dilated convolution, and 1×1 pointwise convolution).
- This design reduces the number of parameters while expanding the receptive field of the network, improving the model's perception capability.
2. **Bottleneck Structure and Symmetric Arrangement**:
- Within each SLKAB, the convolution combinations are arranged symmetrically by receptive field size, forming an attention block with a bottleneck structure. This arrangement is intended to strengthen feature extraction and the fusion of multi-scale information.
- According to the summary, the symmetric bottleneck design improves the model's expressive and generalization capabilities while reducing computational cost.
3. **Experimental Validation**:
- Experimental results show that SVAN requires significantly fewer parameters and FLOPs than existing efficient SISR methods while maintaining high-quality reconstruction. The abstract puts the parameter count at about 30% of that of existing SOTA methods.
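
The parameter savings behind contribution 1 can be checked with a little arithmetic. The sketch below is not taken from the paper: the channel width of 64 and the dilation rate of 3 are assumptions chosen for illustration, biases are ignored, and the helper names are hypothetical. It counts the weights of the 5×5 depthwise, 5×5 depthwise dilated, and 1×1 pointwise stack, and compares them with a single dense convolution covering the same receptive field.

```python
def depthwise_params(channels, kernel):
    """Weights in a depthwise conv: one kernel x kernel filter per channel (no bias)."""
    return channels * kernel * kernel


def pointwise_params(channels_in, channels_out):
    """Weights in a 1x1 conv that mixes channels (no bias)."""
    return channels_in * channels_out


def standard_params(channels_in, channels_out, kernel):
    """Weights in a dense kernel x kernel conv (no bias)."""
    return channels_in * channels_out * kernel * kernel


def effective_kernel(kernel, dilation=1):
    """Spatial extent covered by a dilated kernel."""
    return 1 + (kernel - 1) * dilation


def stacked_receptive_field(layers):
    """Receptive field of stride-1 layers given as (kernel, dilation) pairs."""
    rf = 1
    for kernel, dilation in layers:
        rf += effective_kernel(kernel, dilation) - 1
    return rf


CHANNELS = 64  # assumption: not stated in the summary
DILATION = 3   # assumption: not stated in the summary

# 5x5 depthwise, 5x5 depthwise dilated, 1x1 pointwise
layers = [(5, 1), (5, DILATION), (1, 1)]
rf = stacked_receptive_field(layers)

decomposed = (
    depthwise_params(CHANNELS, 5)
    + depthwise_params(CHANNELS, 5)  # dilation adds no weights
    + pointwise_params(CHANNELS, CHANNELS)
)
dense = standard_params(CHANNELS, CHANNELS, rf)

print(f"receptive field: {rf}x{rf}")
print(f"decomposed weights: {decomposed}")
print(f"dense {rf}x{rf} weights: {dense}")
```

Under these assumptions the three-layer stack covers a 17×17 window with 7,296 weights, whereas a single dense 17×17 convolution over the same channels would need 1,183,744. The exact figures for SVAN depend on the channel width and dilation the authors actually use.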
### Method Overview
1. **Shallow Feature Extraction**:
- A 3×3 convolution layer is used to extract shallow features from the input image.
2. **Deep Feature Extraction**:
- Multiple SLKAB blocks are used for deep feature extraction, with each SLKAB block extracting feature information by combining receptive fields of different sizes.
3. **Pixel Shuffle Reconstruction**:
- After the deep feature extraction stage, a 3×3 depthwise dilated convolution layer is applied, a choice that keeps the parameter count low. Its output is fused with the initial (shallow) feature map through a residual connection.
- Finally, the reconstruction module (including a 3×3 convolution layer and a pixel shuffle layer) upsamples the feature map to a high-resolution image.
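
The pixel shuffle step in the reconstruction module can be illustrated in plain Python. This is a minimal sketch on nested lists, not the authors' implementation: the function name is hypothetical, and the channel ordering follows the common sub-pixel convolution convention, which the summary does not specify. It rearranges a feature map of shape (C·r², H, W) into (C, H·r, W·r) for an upscaling factor r.

```python
def pixel_shuffle(feature_map, scale):
    """Rearrange a (C*scale^2, H, W) nested list into (C, H*scale, W*scale)."""
    in_channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    if in_channels % (scale * scale) != 0:
        raise ValueError("channel count must be divisible by scale squared")

    out_channels = in_channels // (scale * scale)
    out = [
        [[0] * (width * scale) for _ in range(height * scale)]
        for _ in range(out_channels)
    ]
    for c in range(out_channels):
        for i in range(scale):          # row offset inside each scale x scale cell
            for j in range(scale):      # column offset inside each cell
                src = feature_map[c * scale * scale + i * scale + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * scale + i][w * scale + j] = src[h][w]
    return out


# Four 1x2 channels become one 2x4 channel at scale 2.
low_res_features = [[[0, 1]], [[10, 11]], [[20, 21]], [[30, 31]]]
high_res = pixel_shuffle(low_res_features, 2)
print(high_res)
```

Each group of r² input channels supplies the r×r block of output pixels at every spatial location, so the preceding 3×3 convolution has to produce r² times as many channels as the final image has.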
### Experimental Results
- **Quantitative Evaluation**:
- Experimental results on multiple benchmark datasets show that SVAN requires significantly fewer parameters and FLOPs than existing efficient SISR methods, at the cost of only a slight decrease in PSNR and SSIM.
- **Qualitative Evaluation**:
- Although SVAN scores slightly lower on the quantitative metrics, its reconstructions are reported to look better than those of the compared methods, especially in fine detail.
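
For readers unfamiliar with the PSNR metric mentioned above, the sketch below computes it from its standard definition, 10·log10(MAX² / MSE). It is a simplified illustration: images are given as flat lists of 8-bit intensities, the function name and sample values are made up, and the summary does not say which color channel or cropping the authors use for evaluation.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if not reference or len(reference) != len(reconstructed):
        raise ValueError("inputs must be non-empty and the same length")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel example, for illustration only.
reference = [52, 55, 61, 59]
reconstructed = [54, 55, 60, 58]
score = psnr(reference, reconstructed)
print(f"PSNR: {score:.2f} dB")
```

Because PSNR depends only on per-pixel squared error, a method can lose a small amount of PSNR and still produce detail that looks sharper, which is consistent with the gap between the quantitative and qualitative results described here.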
### Conclusion
This paper proposes a lightweight Symmetric Visual Attention Network (SVAN) that significantly improves the efficiency of SISR through an efficient large kernel attention mechanism and a symmetrically arranged bottleneck structure, while maintaining high reconstruction quality. Future work will focus on improving SVAN's quantitative results.