Abstract: Owing to the limitations of optical imaging, image acquisition equipment is usually designed to trade off spatial information against spectral information. A hyperspectral image (HSI) supports fine identification and classification of imaged objects owing to its rich spectral information, while a multispectral image (MSI) provides fine geometric features because of its ample spatial information. Hence, fusing HSI and MSI to achieve information complementarity has become a prevalent approach, which increases the reliability and accuracy of the information obtained. However, unlike traditional optical multi-focus image fusion and pan-sharpening of MSI, existing HSI and MSI fusion methods still struggle to achieve cross-modality information interaction and do not make effective use of spatial location information. To solve these problems and to achieve more effective information integration between HSI and MSI, this paper proposes a novel multi-hierarchical cross transformer for hyperspectral and multispectral image fusion (MCT-Net). The proposed MCT-Net consists of two components: (1) a multi-hierarchical cross-modality interacting module (MCIM), which first extracts deep multi-scale features of the HSI and MSI and then performs cross-modality information interaction between them at identical scales by applying a multi-hierarchical cross transformer (MCT), in order to reconstruct the spectral information lacking in the MSI and the spatial information lacking in the HSI; (2) a feature aggregation reconstruction module (FARM), which combines the features from MCIM, uses strip convolution to further restore edge features, and reconstructs the fusion result through cascaded upsampling. We conduct comparative experiments on five mainstream HSI datasets (Pavia Center, Pavia University, Urban, Botswana, and Washington DC Mall) to demonstrate the effectiveness and superiority of the proposed method. For instance, on the Washington DC Mall dataset, compared with the state-of-the-art (SOTA) method among the compared algorithms, our method improves PSNR by 18.52% and reduces RMSE, ERGAS and SAM by 56.63%, 56.90% and 58.58%, respectively. The source code for MCT-Net can be downloaded from https://github.com/wxy11-27/MCT-Net.
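The abstract reports results in terms of PSNR, RMSE, ERGAS and SAM without defining them. The sketch below shows one common way these four quality metrics are computed for a fused image against a reference; it is not taken from the MCT-Net code. The function names, the flattened pixels-by-bands list layout, and the `data_range` and `scale_ratio` parameters are assumptions made for illustration, and published papers differ in details such as per-band versus global PSNR.

```python
import math

# Assumed layout: an image is a list of pixels, each pixel a list of band values.

def rmse(ref, est):
    """Root-mean-square error over every pixel and band."""
    count = sum(len(pixel) for pixel in ref)
    squared = sum((r - e) ** 2
                  for p_ref, p_est in zip(ref, est)
                  for r, e in zip(p_ref, p_est))
    return math.sqrt(squared / count)

def psnr(ref, est, data_range=1.0):
    """Global peak signal-to-noise ratio in dB; data_range is the assumed peak value."""
    err = rmse(ref, est)
    if err == 0:
        return math.inf
    return 20 * math.log10(data_range / err)

def sam_degrees(ref, est):
    """Mean spectral angle (degrees) between reference and estimated pixel spectra."""
    angles = []
    for p_ref, p_est in zip(ref, est):
        dot = sum(r * e for r, e in zip(p_ref, p_est))
        norm = (math.sqrt(sum(r * r for r in p_ref))
                * math.sqrt(sum(e * e for e in p_est)))
        if norm == 0:
            continue  # skip all-zero spectra, whose angle is undefined
        cosine = max(-1.0, min(1.0, dot / norm))  # clamp rounding overshoot
        angles.append(math.degrees(math.acos(cosine)))
    return sum(angles) / len(angles) if angles else 0.0

def ergas(ref, est, scale_ratio=4):
    """ERGAS: band-wise relative error scaled by the spatial resolution ratio.

    Assumes every reference band has a nonzero mean.
    """
    n_pixels = len(ref)
    n_bands = len(ref[0])
    total = 0.0
    for b in range(n_bands):
        mse_b = sum((p_ref[b] - p_est[b]) ** 2
                    for p_ref, p_est in zip(ref, est)) / n_pixels
        mean_b = sum(p_ref[b] for p_ref in ref) / n_pixels
        total += mse_b / (mean_b ** 2)
    return 100.0 / scale_ratio * math.sqrt(total / n_bands)
```

Higher PSNR is better, while lower RMSE, ERGAS and SAM are better, which is why the abstract reports an increase for the first metric and reductions for the other three. SAM depends only on the direction of each spectrum, so a uniformly brightened estimate scores close to zero on SAM even though its RMSE is large.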

Reciprocal transformer for hyperspectral and multispectral image fusion

Hyperspectral and Multispectral Image Fusion Based on Deep Attention Network

A Novel Transformer Network with a CNN-Enhanced Cross-Attention Mechanism for Hyperspectral Image Classification

HMF-Former: Spatio-Spectral Transformer for Hyperspectral and Multispectral Image Fusion

HMFT: Hyperspectral and Multispectral Image Fusion Super-Resolution Method Based on Efficient Transformer and Spatial-Spectral Attention Mechanism

MCT-Net: Multi-hierarchical cross transformer for hyperspectral and multispectral image fusion

Hierarchical Spectral–Spatial Transformer for Hyperspectral and Multispectral Image Fusion

Hyperspectral and multispectral images fusion based on pyramid swin transformer

Unsupervised Hybrid Network of Transformer and CNN for Blind Hyperspectral and Multispectral Image Fusion

Fusformer: A Transformer-based Fusion Approach for Hyperspectral Image Super-resolution

Mutually Beneficial Transformer for Multimodal Data Fusion

Adaptive Learnable Spectral–Spatial Fusion Transformer for Hyperspectral Image Classification

MIMO-SST: Multi-Input Multi-Output Spatial-Spectral Transformer for Hyperspectral and Multispectral Image Fusion

A multimodal hyper-fusion transformer for remote sensing image classification

Attention and transformer complementary fusion network for hyperspectral image spectral reconstruction

Double-branch feature fusion transformer for hyperspectral image classification

A Joint Convolutional Cross ViT Network for Hyperspectral and Light Detection and Ranging Fusion Classification

Multimodal Fusion Transformer for Remote Sensing Image Classification

MHIAIFormer: Multihead Interacted and Adaptive Integrated Transformer With Spatial-Spectral Attention for Hyperspectral Image Classification

Advancing Hyperspectral and Multispectral Image Fusion: An Information-Aware Transformer-Based Unfolding Network