Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification

Weilian Zhou, Sei-Ichiro Kamata, Haipeng Wang, Man-Sing Wong, Huiying

2024-07-13

Abstract: Hyperspectral image (HSI) classification is pivotal in the remote sensing (RS) field, particularly with the advancement of deep learning techniques. Sequential models adapted from the natural language processing (NLP) field, such as Recurrent Neural Networks (RNNs) and Transformers, have been tailored to this task, offering a unique viewpoint. However, several challenges persist: 1) RNNs struggle with centric feature aggregation and are sensitive to interfering pixels, 2) Transformers require significant computational resources and often underperform with limited HSI training samples, and 3) current scanning methods for converting images into sequence data are simplistic and inefficient. In response, this study introduces the Mamba-in-Mamba (MiM) architecture for HSI classification, the first attempt to deploy a State Space Model (SSM) in this task. The MiM model includes 1) a novel centralized Mamba-Cross-Scan (MCS) mechanism for transforming images into sequence data, 2) a Tokenized Mamba (T-Mamba) encoder that incorporates a Gaussian Decay Mask (GDM), a Semantic Token Learner (STL), and a Semantic Token Fuser (STF) for enhanced feature generation and concentration, and 3) a Weighted MCS Fusion (WMF) module coupled with a Multi-Scale Loss Design to improve decoding efficiency. Experimental results on three public HSI datasets with fixed and disjoint training-testing samples demonstrate that the method outperforms existing baselines and state-of-the-art approaches, highlighting its efficacy and potential in HSI applications.
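To make the idea of a Gaussian Decay Mask more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the mask weights each token in a scanned sequence by a Gaussian of its sequence distance to the central pixel's token; the function names, the `center_index` and `sigma` parameters, and the multiplicative application are all assumptions made for illustration.

```python
import numpy as np


def gaussian_decay_mask(seq_len: int, center_index: int, sigma: float) -> np.ndarray:
    """Hypothetical mask: weight 1.0 at the center token, decaying with distance.

    Assumption: distance is measured along the sequence, not in the 2-D patch.
    """
    positions = np.arange(seq_len)
    return np.exp(-((positions - center_index) ** 2) / (2.0 * sigma ** 2))


def apply_mask(tokens: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Scale each token (row of a [seq_len, channels] array) by its mask weight."""
    return tokens * mask[:, None]


# Example: a 5-token sequence whose last token is the central pixel.
mask = gaussian_decay_mask(seq_len=5, center_index=4, sigma=1.0)
weighted = apply_mask(np.ones((5, 3)), mask)
```

Under these assumptions, tokens far from the center contribute less to later aggregation, which is one way to reduce the influence of interfering pixels.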

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses challenges in hyperspectral image (HSI) classification by proposing a new architecture, Mamba-in-Mamba (MiM), to improve on the performance and efficiency of existing methods.

#### Main Issues:

1. **Limitations of RNNs**:
   - RNNs are susceptible to noisy pixels when processing hyperspectral images and are computationally inefficient on larger image patches.
2. **Limitations of Transformers**:
   - Transformers require substantial computational resources and perform poorly when training samples are limited.
   - Transformers lack the ability to capture local spatial features effectively.

#### Solutions:

1. **Innovative Scanning Mechanism (Centralized Mamba-Cross-Scan, MCS)**:
   - A new scanning method converts image patches into sequences along multiple directions, better capturing the features of the central pixel.
2. **Tokenized Mamba Encoder (T-Mamba Encoder)**:
   - Combines a Gaussian Decay Mask (GDM), a Semantic Token Learner (STL), and a Semantic Token Fuser (STF) to enhance feature generation and concentration.
3. **Weighted MCS Fusion Module (WMF)**:
   - Combined with a multi-scale loss design to improve decoding efficiency.

With these components, the paper reports that the approach outperforms existing baselines and state-of-the-art methods on three public hyperspectral image datasets, demonstrating its effectiveness and potential in hyperspectral image classification.
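The idea of turning an image patch into sequences that focus on the central pixel can be sketched as follows. This is an illustrative reading only, not the paper's MCS definition: it assumes a square patch with odd side length, a snake (boustrophedon) row scan, and a split into two sub-sequences that both end at the central pixel. The function names are hypothetical.

```python
import numpy as np


def snake_scan_indices(p: int) -> np.ndarray:
    """Flat pixel indices of a p x p patch in snake order (odd rows reversed)."""
    idx = np.arange(p * p).reshape(p, p)
    idx[1::2] = idx[1::2, ::-1].copy()
    return idx.reshape(-1)


def centralized_scan(patch: np.ndarray):
    """Split a snake scan of a [p, p, channels] patch at the central pixel.

    Returns two sequences, each ending at the central pixel (an assumption
    about what "centralized" means here).
    """
    p = patch.shape[0]
    if patch.shape[1] != p or p % 2 == 0:
        raise ValueError("expected a square patch with odd side length")
    flat = patch.reshape(p * p, -1)
    seq = flat[snake_scan_indices(p)]
    mid = (p * p) // 2                 # the center pixel's position in snake order
    forward = seq[: mid + 1]           # first pixel -> center
    backward = seq[mid:][::-1]         # last pixel -> center
    return forward, backward


# Example on a 3 x 3 patch with one channel holding values 0..8.
patch = np.arange(9).reshape(3, 3, 1)
forward, backward = centralized_scan(patch)
```

Other scan directions could be obtained under the same assumptions by transposing or flipping the patch before calling `centralized_scan`, for example `centralized_scan(patch.transpose(1, 0, 2))` for a column-wise scan.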

Spectral-Spatial Mamba for Hyperspectral Image Classification

3DSS-Mamba: 3D-Spectral-Spatial Mamba for Hyperspectral Image Classification

S^2Mamba: A Spatial-spectral State Space Model for Hyperspectral Image Classification

State Space Models Meet Transformers for Hyperspectral Image Classification

SpectralMamba: Efficient Mamba for Hyperspectral Image Classification

Multi-head Spatial-Spectral Mamba for Hyperspectral Image Classification

Cross-Scan Mamba with Masked Training for Robust Spectral Imaging

HSIMamba: Hyperpsectral Imaging Efficient Feature Learning with Bidirectional State Space for Classification

DualMamba: A Lightweight Spectral-Spatial Mamba-Convolution Network for Hyperspectral Image Classification

IGroupSS-Mamba: Interval Group Spatial-Spectral Mamba for Hyperspectral Image Classification

MSFMamba: Multi-Scale Feature Fusion State Space Model for Multi-Source Remote Sensing Image Classification

WaveMamba: Spatial-Spectral Wavelet Mamba for Hyperspectral Image Classification

RSMamba: Remote Sensing Image Classification with State Space Model

Bidirectional Mamba with Dual-Branch Feature Extraction for Hyperspectral Image Classification

MHS-VM: Multi-Head Scanning in Parallel Subspaces for Vision Mamba

Spatial-Spectral Morphological Mamba for Hyperspectral Image Classification

GraphMamba: An Efficient Graph Structure Learning Vision Mamba for Hyperspectral Image Classification

HSIDMamba: Exploring Bidirectional State-Space Models for Hyperspectral Denoising