Abstract: Remote sensing image fusion aims to generate a high-resolution multi/hyper-spectral image by combining a high-resolution image with limited spectral data and a low-resolution image rich in spectral information. Current deep learning (DL) methods typically employ convolutional neural networks (CNNs) or Transformers for feature extraction and information integration. While CNNs are efficient, their limited receptive fields restrict their ability to capture global context. Transformers excel at learning global information but are computationally expensive. Recent advances in the state space model (SSM), particularly Mamba, present a promising alternative by enabling global perception with low complexity. However, the potential of SSM for information integration remains largely unexplored. Therefore, we propose FusionMamba, an innovative method for efficient remote sensing image fusion. Our contributions are twofold. First, to effectively merge spatial and spectral features, we expand the single-input Mamba block to accommodate dual inputs, creating the FusionMamba block, which serves as a plug-and-play solution for information integration. Second, we incorporate Mamba and FusionMamba blocks into an interpretable network architecture tailored for remote sensing image fusion. Our design uses two U-shaped network branches, each primarily composed of four-directional Mamba blocks, to extract spatial and spectral features separately and hierarchically. The resulting feature maps are thoroughly merged in an auxiliary network branch constructed with FusionMamba blocks. Furthermore, we improve the representation of spectral information through an enhanced channel attention module. Quantitative and qualitative evaluation results across six datasets demonstrate that our method achieves state-of-the-art (SOTA) performance. The code is available at https://github.com/PSRben/FusionMamba.
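To make the two ideas named in the abstract more concrete (a four-directional scan and a state space block that takes two inputs), the following is a minimal toy sketch in plain Python. It is not the FusionMamba implementation from the repository above. The function names `four_direction_orders`, `dual_input_scan`, and `fuse_2d` are hypothetical, the state is a single scalar, and the choice to derive the step size and gates from the second input via softplus and tanh is an assumption made only for illustration.

```python
import math


def four_direction_orders(height, width):
    """Return four index orderings that flatten an H x W grid into sequences:
    row-major, reversed row-major, column-major, reversed column-major."""
    row_major = [r * width + c for r in range(height) for c in range(width)]
    col_major = [r * width + c for c in range(width) for r in range(height)]
    return [row_major, row_major[::-1], col_major, col_major[::-1]]


def dual_input_scan(x, y, a=-1.0):
    """Toy scalar state space scan over x, modulated by a second sequence y.

    Recurrence: h_t = exp(delta_t * a) * h_{t-1} + delta_t * b_t * x_t
                out_t = c_t * h_t
    Assumption (illustrative only): delta_t, b_t, c_t are fixed functions of y_t
    rather than learned projections.
    """
    h = 0.0
    out = []
    for x_t, y_t in zip(x, y):
        delta = math.log1p(math.exp(y_t))  # softplus keeps the step size positive
        b = math.tanh(y_t)                 # input gate taken from the second input
        c = math.tanh(y_t)                 # output gate taken from the second input
        h = math.exp(delta * a) * h + delta * b * x_t
        out.append(c * h)
    return out


def fuse_2d(x_flat, y_flat, height, width):
    """Scan two flattened H x W maps in four directions and average the results."""
    orders = four_direction_orders(height, width)
    fused = [0.0] * (height * width)
    for order in orders:
        scanned = dual_input_scan([x_flat[i] for i in order],
                                  [y_flat[i] for i in order])
        # Scatter each scanned value back to its original pixel position.
        for pos, i in enumerate(order):
            fused[i] += scanned[pos] / len(orders)
    return fused


if __name__ == "__main__":
    spatial = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # toy 2 x 3 "spatial" map
    spectral = [1.0] * 6                        # toy 2 x 3 "spectral" map
    print(fuse_2d(spatial, spectral, 2, 3))
```

Each scan visits every pixel once, so the cost grows linearly with the number of pixels, which is the low-complexity property the abstract attributes to SSMs. Averaging the four directions lets every output position receive information from pixels on all sides, a simple stand-in for the global perception described above.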

Why mamba is effective? Exploit Linear Transformer-Mamba Network for Multi-Modality Image Fusion

FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba

MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion

Multi-Modal Image Fusion Via Deep Laplacian Pyramid Hybrid Network

FusionMamba: Efficient Image Fusion with State Space Model

Fusion-Mamba for Cross-modality Object Detection

FusionMamba: Efficient Remote Sensing Image Fusion with State Space Model

A Novel State Space Model with Local Enhancement and State Sharing for Image Fusion

MBHFuse: A Multi-Branch Heterogeneous Global and Local Infrared and Visible Image Fusion with Differential Convolutional Amplification Features

AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-modal Alignment

MFMamba: A Mamba-Based Multi-Modal Fusion Network for Semantic Segmentation of Remote Sensing Images

MambaVision: A Hybrid Mamba-Transformer Vision Backbone

MFT: Multi-scale Fusion Transformer for Infrared and Visible Image Fusion

Demystify Mamba in Vision: A Linear Attention Perspective

MDC-RHT: Multi-Modal Medical Image Fusion via Multi-Dimensional Dynamic Convolution and Residual Hybrid Transformer

MAMFuse: Multi-modality Image Fusion with Multiscale Attention Mechanism

MSFMamba: Multi-Scale Feature Fusion State Space Model for Multi-Source Remote Sensing Image Classification

TMFIF: Transformer-based Multi-Focus Image Fusion

CMFuse: Cross-Modal Features Mixing Via Convolution and MLP for Infrared and Visible Image Fusion

MACTFusion: Lightweight Cross Transformer for Adaptive Multimodal Medical Image Fusion

MxT: Mamba x Transformer for Image Inpainting