Improving Single-Image Super-Resolution with Dilated Attention

Xinyu Zhang, Boyuan Cheng, Xiaosong Yang, Zhidong Xiao, Jianjun Zhang, Lihua You
DOI: https://doi.org/10.3390/electronics13122281
IF: 2.9
2024-06-12
Electronics
Abstract: Single-image super-resolution (SISR) techniques have become a vital tool for improving image quality and clarity in the rapidly evolving field of digital imaging. Convolutional neural network (CNN) and transformer-based SISR techniques are very popular. However, CNN-based techniques are not suitable when capturing long-range dependencies, and transformer-based techniques suffer from computational complexity. To tackle these problems, this paper proposes a novel method called dilated attention-based single-image super-resolution (DAIR). It comprises three components: low-level feature extraction, multi-scale dilated transformer block (MDTB), and high-quality image reconstruction. A convolutional layer is used to extract the base features from low-resolution images, which lays the foundation for subsequent processing. Dilated attention is introduced to MDTB to enhance its ability to capture image features at different scales and ensure superior image details and structure recovery. After that, MDTB refines these features to extract multi-scale global attributes and effectively grasps images' long-distance relationships and features across multiple scales. Finally, low-level features obtained from feature extraction and multi-scale global features obtained from MDTB are aggregated to reconstruct high-resolution images. The comparison with existing methods validates the efficacy of the proposed method and demonstrates its advantage in improving image resolution and quality.
engineering, electrical & electronic; computer science, information systems; physics, applied
What problem does this paper attempt to address?
This paper addresses two main problems in single-image super-resolution (SISR):

1. **Limitations of convolutional neural network (CNN) methods**: CNNs process local context effectively but struggle to capture long-range dependencies. Because convolution operates on local neighborhoods, CNNs focus mainly on local features. This limits their ability to integrate context over a wider area and degrades the restoration of detail and structure in complex textures or patterns.
2. **Computational complexity of Transformer-based methods**: Transformers can model long-range dependencies, but their computational complexity grows quadratically with spatial resolution, which makes them impractical for high-resolution image restoration.

To address these problems, the paper proposes Dilated Attention-based Single-Image Super-Resolution (DAIR). DAIR expands the receptive field of the Transformer by introducing a dilated attention mechanism, without increasing computational complexity. It consists of three main modules:

- **Low-level feature extraction**: A convolutional layer extracts basic features from the low-resolution image, laying the foundation for subsequent processing.
- **Multi-scale dilated Transformer block (MDTB)**: Dilated attention enhances the model's ability to capture image features at different scales, supporting high-quality restoration of image details and structures.
- **High-quality image reconstruction**: Low-level features and multi-scale global features are aggregated to reconstruct the high-resolution image.

With this design, DAIR captures image features at multiple scales and improves the restoration of image detail and structure, while avoiding the quadratic attention cost that limits standard Transformer-based methods.
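
To make the idea of dilated attention concrete, the following is a minimal sketch in plain Python. It is not the authors' MDTB: the summary above does not specify the block's internals, so everything here is an assumption for illustration. Each pixel attends only to a small window of neighbours sampled on a dilated grid, queries, keys, and values are all the raw features (no learned projections), and the multi-scale step simply averages the outputs of several dilation rates. The function names `dilated_window_attention`, `multi_scale_dilated_attention`, `receptive_span`, and `attention_pairs` are hypothetical.

```python
import math


def dilated_window_attention(x, window=3, dilation=1):
    """Attention over a dilated window. x is an H x W x C nested list."""
    H, W, C = len(x), len(x[0]), len(x[0][0])
    r = window // 2
    # Offsets of the window positions, spread out by the dilation rate.
    offsets = [(di * dilation, dj * dilation)
               for di in range(-r, r + 1) for dj in range(-r, r + 1)]
    out = [[None] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            q = x[i][j]
            # Keys and values: in-bounds neighbours on the dilated grid.
            # Assumption: no learned projections, so key == value == feature.
            keys = [x[i + a][j + b] for a, b in offsets
                    if 0 <= i + a < H and 0 <= j + b < W]
            scores = [sum(qc * kc for qc, kc in zip(q, k)) / math.sqrt(C)
                      for k in keys]
            m = max(scores)  # subtract the max for a numerically stable softmax
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            out[i][j] = [sum(e * k[c] for e, k in zip(exps, keys)) / z
                         for c in range(C)]
    return out


def multi_scale_dilated_attention(x, window=3, dilations=(1, 2, 3)):
    """Average the outputs of several dilation rates (illustrative choice)."""
    outs = [dilated_window_attention(x, window, d) for d in dilations]
    H, W, C = len(x), len(x[0]), len(x[0][0])
    return [[[sum(o[i][j][c] for o in outs) / len(outs) for c in range(C)]
             for j in range(W)] for i in range(H)]


def receptive_span(window, dilation):
    """Side length, in pixels, of the area covered by one dilated window."""
    return (window - 1) * dilation + 1


def attention_pairs(h, w, window=None):
    """Upper bound on query-key pairs: full attention if window is None."""
    n = h * w
    return n * n if window is None else n * window * window
```

In this sketch the number of keys per query depends only on `window`, so raising `dilation` widens the covered area (`receptive_span(3, 1)` is 3 pixels, `receptive_span(3, 3)` is 7) without adding score computations. For a 64 × 64 feature map, full attention scores 16,777,216 query-key pairs, whereas a 3 × 3 window scores at most 36,864 regardless of the dilation rate.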