Abstract: Deep learning has emerged as the predominant approach for multispectral and hyperspectral image fusion. However, most fusion networks are trained and validated on hyperspectral and multispectral image pairs generated from the same hyperspectral images, using simulated degradations that do not match real conditions and relatively small volumes of images. When a pretrained multispectral and hyperspectral image fusion model is transferred from ground or airborne images to spaceborne images, the model encounters a larger dataset and more complex spatial-spectral degradation, which leads to spectral distortions and spatial artifacts in the fused images. In this article, the challenges associated with this transfer are addressed through a self-supervised multispectral and hyperspectral image fusion unrolling network for spaceborne imagery, termed MH-FUNet. MH-FUNet adopts a self-supervised paradigm to learn a robust mapping from spaceborne data and uses a deep unrolling network to iteratively refine fusion results from coarse to fine. To account for spatial scale differences between the self-supervised training and test datasets, a multiscale fusion strategy is introduced and combined with spectral and spatial attention mechanisms to restore spatial and spectral details. Additionally, a gradient constraint unit is proposed to maintain spatial consistency when upscaling low-resolution hyperspectral imagery. The proposed method is evaluated against state-of-the-art fusion techniques on both the simulated Chikusei dataset and the proposed real WHU-MHF dataset, which consists of simultaneously observed hyperspectral and multispectral image pairs. MH-FUNet outperforms existing methods on both datasets, demonstrating superior performance in spaceborne multispectral and hyperspectral image fusion experiments.
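The abstract does not spell out how the self-supervised training pairs are built. One common way to train a fusion network without a high-resolution hyperspectral reference is to degrade both observed images by the resolution ratio and use the observed low-resolution hyperspectral image as the target, which would also explain the scale gap between training and test data that the multiscale strategy addresses. The sketch below illustrates that reduced-resolution setup only. Block averaging as the spatial degradation, and the names `block_average` and `make_self_supervised_pair`, are assumptions made for illustration, not details taken from MH-FUNet.

```python
import numpy as np


def block_average(img, ratio):
    """Downsample an (H, W, C) image by averaging non-overlapping ratio x ratio blocks.

    Assumption: block averaging stands in for the unknown sensor blur and
    downsampling; a real pipeline would likely use an estimated blur kernel.
    """
    h, w, c = img.shape
    if h % ratio or w % ratio:
        raise ValueError("spatial size must be divisible by ratio")
    blocks = img.reshape(h // ratio, ratio, w // ratio, ratio, c)
    return blocks.mean(axis=(1, 3))


def make_self_supervised_pair(lr_hsi, hr_msi, ratio):
    """Build a reduced-resolution training sample (hypothetical helper).

    lr_hsi: observed low-resolution hyperspectral image, shape (h, w, C)
    hr_msi: observed high-resolution multispectral image, shape (h*ratio, w*ratio, c)
    Returns (degraded HSI, degraded MSI, target), where the target is the
    observed lr_hsi itself, so no external ground truth is required.
    """
    degraded_hsi = block_average(lr_hsi, ratio)
    degraded_msi = block_average(hr_msi, ratio)
    return degraded_hsi, degraded_msi, lr_hsi
```

With an 8 x 8 hyperspectral tile, a 32 x 32 multispectral tile, and a ratio of 4, the network would be trained to map a 2 x 2 hyperspectral input and an 8 x 8 multispectral input back to the 8 x 8 hyperspectral tile, then applied at the original scale at test time.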

Multimodal Image Fusion Via Self-Supervised Transformer

A Self-Supervised Spaceborne Multispectral and Hyperspectral Image Fusion Unrolling Network

LeGFusion: Locally Enhanced Global Learning for Multimodal Image Fusion

MFST: Multi-Modal Feature Self-Adaptive Transformer for Infrared and Visible Image Fusion

Equivariant Multi-Modality Image Fusion

Multi-modal medical image fusion based on densely-connected high-resolution CNN and hybrid transformer

A Self-Supervised Residual Feature Learning Model for Multifocus Image Fusion

A layer-wise fusion network incorporating self-supervised learning for multimodal MR image synthesis

Multimodal Fusion Method Based on Self-Attention Mechanism

Incomplete Multimodal Learning for Remote Sensing Data Fusion

TransFuse: A Unified Transformer-based Image Fusion Framework using Self-supervised Learning

Multi-Modal Image Fusion Via Deep Laplacian Pyramid Hybrid Network

MDC-RHT: Multi-Modal Medical Image Fusion via Multi-Dimensional Dynamic Convolution and Residual Hybrid Transformer

MEFusion: Unsupervised Mutual Enhancement for Multimodal Image Fusion

Trans2Fuse: Empowering image fusion through self-supervised learning and multi-modal transformations via transformer networks

Self-MI: Efficient Multimodal Fusion via Self-Supervised Multi-Task Learning with Auxiliary Mutual Information Maximization

MIMF: Mutual Information-Driven Multimodal Fusion

Multimodal Token Fusion for Vision Transformers

MMFormer: Multimodal Transformer Using Multiscale Self-Attention for Remote Sensing Image Classification