Abstract: Self-attention in Transformer-based methods has shown great potential for image super-resolution (SR) because it captures long-range dependencies. However, many existing Transformer-based SR methods extract features locally within a small window and rely on shifted-window self-attention to incorporate long-range dependencies gradually, so they may not effectively exploit non-local image information. To overcome this limitation, we propose a novel non-local self-attention (NLSA) mechanism that directly models non-local dependencies. First, NLSA uses locality-sensitive hashing to identify similar pixel-wise features at minimal computational cost. Next, a pixel-shuffling operation gathers similar features into the same window, which effectively expands the receptive field beyond the window size. We then introduce a simplified window self-attention (SiWSA) that operates within each window to capture intrinsic long-range dependencies among the shuffled features, regardless of position information. Finally, after the SiWSA calculation, the features are shuffled back to their original positions to maintain data consistency. The overall NLSA mechanism captures non-local information without requiring excessively deep networks to enlarge the receptive field. Based on NLSA, we propose a non-local self-attention network (NLSAN) designed explicitly for the SR task. Extensive experiments demonstrate that NLSAN outperforms several state-of-the-art SR methods in both quantitative and qualitative assessments. The code of the proposed method is available at https://github.com/zengkun301/NLSAN.
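The four steps described in the abstract (hash, shuffle, attend within windows, shuffle back) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes a random-hyperplane hash as the locality-sensitive hashing scheme, a sort by hash code as a stand-in for the pixel-shuffling step, and plain softmax attention with no learned projections and no positional term as a stand-in for SiWSA. The names `lsh_codes`, `window_attention`, and `nlsa_sketch` are hypothetical.

```python
import math
import random


def lsh_codes(feats, n_bits, rng):
    """Assumed hash: sign of random projections, packed into an integer code.

    Vectors pointing in similar directions tend to receive the same code.
    """
    dim = len(feats[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    codes = []
    for f in feats:
        code = 0
        for p in planes:
            bit = sum(a * b for a, b in zip(f, p)) > 0
            code = (code << 1) | int(bit)
        codes.append(code)
    return codes


def window_attention(window):
    """Softmax self-attention inside one window, without positional terms.

    Queries, keys and values are the raw features (no learned projections).
    """
    dim = len(window[0])
    out = []
    for q in window:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in window]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(
            [sum(w * v[d] for w, v in zip(weights, window)) / total for d in range(dim)]
        )
    return out


def nlsa_sketch(feats, window_size=4, n_bits=4, seed=0):
    """Hash, shuffle, attend per window, then shuffle back.

    feats is a flat list of per-pixel feature vectors.
    """
    rng = random.Random(seed)
    codes = lsh_codes(feats, n_bits, rng)
    # "Shuffle": a stable sort by hash code places similar pixels side by side.
    order = sorted(range(len(feats)), key=lambda i: codes[i])
    shuffled = [feats[i] for i in order]
    attended = []
    for start in range(0, len(shuffled), window_size):
        attended.extend(window_attention(shuffled[start:start + window_size]))
    # "Shuffle back": return every result to its pixel's original position.
    out = [None] * len(feats)
    for pos, i in enumerate(order):
        out[i] = attended[pos]
    return out, order


if __name__ == "__main__":
    pixels = [[1.0, 0.0], [-1.0, 0.0], [0.9, 0.1], [-0.9, -0.1]]
    result, order = nlsa_sketch(pixels, window_size=2)
    print(order)
    print(result)
```

In this sketch, pixels that are far apart in the image but have similar features can end up in the same window after sorting, which is how a fixed window size can cover non-local content. A practical version would operate on batched tensors and add learned projections.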

SANet: Face Super-Resolution Based on Self-Similarity Prior and Attention Integration

Selective Domain-Invariant Feature Alignment Network for Face Anti-Spoofing

SCTANet: A Spatial Attention-Guided CNN-Transformer Aggregation Network for Deep Face Image Super-Resolution

Self-attention learning network for face super-resolution

TANet: A new Paradigm for Global Face Super-resolution via Transformer-CNN Aggregation Network

Facial mask attention network for identity-aware face super-resolution

Attention-Guided Multi-scale Interaction Network for Face Super-Resolution

FSRNet: End-to-End Learning Face Super-Resolution with Facial Priors

The face image super-resolution algorithm based on combined representation learning

Multi-level landmark-guided deep network for face super-resolution

MSRFSR: Multi-Stage Refining Face Super-Resolution With Iterative Collaboration Between Face Recovery and Landmark Estimation

CVANet: Cascaded visual attention network for single image super-resolution

Deep Face Super-Resolution with Iterative Collaboration between Attentive Recovery and Landmark Estimation

Second-Order Attention Network For Single Image Super-Resolution

W-Net: A Facial Feature-Guided Face Super-Resolution Network

Non-local self-attention network for image super-resolution

Multi-Scale Feature Fusion and Structure-Preserving Network for Face Super-Resolution

Attention-Driven Graph Neural Network for Deep Face Super-Resolution

A Composite Network Model for Face Super-Resolution with Multi-Order Head Attention Facial Priors

A Two-Stage Attentive Network for Single Image Super-Resolution

Edge-Enhanced with Feedback Attention Network for Image Super-Resolution