Abstract: Hyperspectral image (HSI) super-resolution is a practical but challenging task because it requires reconstructing a large number of spectral bands, and high-quality reconstruction directly benefits downstream tasks. Current mainstream hyperspectral super-resolution methods are mostly built on 3D convolutional neural networks (3D CNNs). However, the small kernel sizes commonly used in 3D CNNs limit the model's receptive field and prevent it from taking a wider range of contextual information into account. The receptive field can be expanded by enlarging the kernel, but doing so dramatically increases the number of model parameters. Furthermore, the popular vision transformers designed for natural images are not well suited to HSI: because HSI is sparse in the spatial domain, applying self-attention wastes significant computational resources. In this paper, we design a hybrid architecture called HyFormer, which combines the strengths of CNNs and transformers for hyperspectral super-resolution. The transformer branch enables intra-spectra interaction to capture fine-grained contextual details at each specific wavelength, while the CNN branch performs efficient inter-spectra feature extraction across wavelengths and maintains a large receptive field. Specifically, in the transformer branch we propose a novel Grouping-Aggregation transformer (GAT), comprising grouping self-attention (GSA) and aggregation self-attention (ASA). GSA extracts diverse fine-grained features of targets, while ASA facilitates interaction among the heterogeneous textures allocated to different channels. In the CNN branch, we propose Wide-Spanning Separable 3D Attention (WSSA) to enlarge the receptive field while keeping the parameter count low. Building on WSSA, we construct a wide-spanning CNN module that efficiently extracts inter-spectra features. Extensive experiments demonstrate the superior performance of HyFormer.
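The parameter growth mentioned in the abstract can be made concrete with a small arithmetic sketch. A dense 3D convolution with a cubic kernel of side k has C_in × C_out × k³ weights, so widening the kernel is costly. The factorization below (three 1D convolutions, one per axis) is only a generic illustration of why separable designs stay cheap; it is an assumption for this example and is not the actual WSSA layout, which the abstract does not specify. The function names and the channel width of 64 are likewise hypothetical.

```python
def conv3d_params(c_in, c_out, k):
    """Weight count of a dense 3D convolution with a k x k x k kernel (no bias)."""
    return c_in * c_out * k ** 3


def separable3d_params(c_in, c_out, k):
    """Weight count of an assumed axis-wise factorization (not the paper's WSSA):
    three 1D convolutions with kernels (k,1,1), (1,k,1), (1,1,k), no bias.
    The first maps c_in -> c_out, the other two map c_out -> c_out."""
    return c_in * c_out * k + 2 * c_out * c_out * k


if __name__ == "__main__":
    channels = 64  # illustrative channel width, not taken from the paper
    for k in (3, 5, 7, 9):
        dense = conv3d_params(channels, channels, k)
        sep = separable3d_params(channels, channels, k)
        print(f"k={k}: dense={dense:,} weights, separable={sep:,} weights")
```

Under these assumptions the dense kernel grows cubically with k while the factorized variant grows linearly, which is the general trade-off that separable large-kernel designs exploit.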

HyFormer: Hybrid Grouping-Aggregation Transformer and Wide-Spanning CNN for Hyperspectral Image Super-Resolution
