Abstract: Because the global self-attention mechanism captures long-distance dependencies well, Transformer-based methods have achieved remarkable performance in many vision tasks, including single-image super-resolution (SISR). However, images exhibit strong local self-similarity, so applying global self-attention can spend excessive computing resources on image regions that are only weakly correlated. This is especially true for large, high-resolution images, where global self-attention incurs a large number of redundant calculations. To solve this problem, we propose the Enhanced Local Multi-windows Attention Network (ELMA), which contains two main designs. First, unlike traditional self-attention based on square window partitioning, we propose Multi-windows Self-Attention (M-WSA), which uses a new window partitioning mechanism to obtain different types of local long-distance dependencies. Analysis and experiments show that, compared with the original self-attention mechanisms commonly used in other SR networks, M-WSA reduces computational complexity and achieves superior performance. Second, we propose a Spatial Gated Network (SGN) as the feed-forward network, which effectively overcomes the intermediate channel redundancy of the traditional MLP, thereby improving the parameter utilization and computational efficiency of the network. SGN also introduces spatial information into the feed-forward network, which a traditional MLP cannot obtain, allowing it to better understand and use the spatial structure of the image and enhancing the network's performance and generalization ability. Extensive experiments show that ELMA achieves competitive performance compared to state-of-the-art methods while using fewer parameters and less computation.
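
The complexity argument in the abstract can be illustrated with a rough cost model. The sketch below is not the paper's M-WSA; it only compares the standard multiply-accumulate estimates for global self-attention and for attention restricted to non-overlapping windows. The function names, the feature-map size, the channel count, and the window shapes (including the rectangular ones) are all hypothetical choices made for illustration, since the abstract does not specify them.

```python
def global_attention_cost(height, width, channels):
    """Approximate multiply-accumulates for global self-attention on an H x W map."""
    tokens = height * width
    projections = 4 * tokens * channels ** 2   # Q, K, V and output projections
    attention = 2 * tokens ** 2 * channels     # QK^T plus the weighted sum over V
    return projections + attention


def window_attention_cost(height, width, channels, win_h, win_w):
    """Approximate multiply-accumulates when attention stays inside win_h x win_w windows."""
    if height % win_h or width % win_w:
        raise ValueError("feature map must be divisible by the window size")
    tokens = height * width
    tokens_per_window = win_h * win_w
    projections = 4 * tokens * channels ** 2   # projections are unchanged by windowing
    # Each token attends only to the tokens inside its own window.
    attention = 2 * tokens * tokens_per_window * channels
    return projections + attention


if __name__ == "__main__":
    # Hypothetical feature-map size and channel count, chosen only for illustration.
    H, W, C = 64, 64, 60
    print("global:", global_attention_cost(H, W, C))
    # Assumed window shapes: one square and two rectangular (not taken from the paper).
    for win in [(8, 8), (4, 16), (16, 4)]:
        print(win, window_attention_cost(H, W, C, *win))
```

Under this model the attention term grows quadratically with the number of tokens for global attention but only linearly for windowed attention, and windows with the same token count cost the same regardless of shape. This is why differently shaped windows can, in principle, cover different dependency patterns at a similar cost.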

Enhanced local multi-windows attention network for lightweight image super-resolution
