Abstract: Data-hungry hyperspectral image (HSI) classification methods require high-quality labeled HSIs, which are often costly to obtain. This requirement limits the performance potential of data-driven methods when annotated samples are limited. Bridging the domain gap between data acquired from different sensors allows us to utilize abundant labeled data across sensors to break this bottleneck. In this paper, we propose a novel Attention-Gated Tuning (AGT) strategy and a triplet-structured transformer model, Tri-Former, to address this issue. The AGT strategy serves as a bridge, allowing us to leverage existing labeled HSI datasets, and even RGB datasets, to enhance the performance on new HSI datasets with limited samples. Instead of inserting additional parameters inside the basic model, we train a lightweight auxiliary branch that takes intermediate features from the basic model as input and makes predictions. The proposed AGT resolves conflicts between heterogeneous and even cross-modal data by suppressing the disturbing information and enhancing the useful information through a soft gate. Additionally, we introduce Tri-Former, a triplet-structured transformer with a spectral-spatial separation design that enhances parameter utilization and computational efficiency, enabling easier and more flexible fine-tuning. Comparison experiments conducted on three representative HSI datasets captured by different sensors demonstrate that the proposed Tri-Former achieves better performance than several state-of-the-art methods. Homologous, heterologous and cross-modal tuning experiments verify the effectiveness of the proposed AGT. Code has been released at: https://github.com/Cecilia-xue/AGT

What problem does this paper attempt to address?

The main problem this paper attempts to solve is **the cross-sensor data domain gap in hyperspectral image (HSI) classification**. Specifically, the paper aims to overcome the following challenges:

1. **Scarcity of labeled data**: Hyperspectral image classification methods usually require high-quality labeled data, but obtaining such data is both time-consuming and expensive. This limits the performance potential of data-driven methods when only a limited number of labeled samples is available.
2. **Differences in cross-sensor and cross-modal data**: There are significant structural and feature differences between data obtained by different sensors, so transfer learning performs poorly when data from other sensors or modalities is used directly.

To solve these problems, the authors propose two main innovations:

### 1. Attention-Gated Tuning (AGT) Strategy

The AGT strategy aims to reconcile the conflicts between heterogeneous and cross-modal data by introducing a lightweight auxiliary branch. Specifically:

- **Suppress interfering information**: Suppress irrelevant noise information through a soft gate mechanism.
- **Enhance useful information**: Enhance the semantic information from the base model, thereby improving model performance.
- **Utilize multi-source data**: It can utilize not only existing hyperspectral datasets but also RGB datasets to enhance the performance on new HSI datasets.

### 2. Tri-Former Model

Tri-Former is a triplet-structured Transformer model with the following characteristics:

- **Spectral-spatial separation design**: Improve parameter utilization and computational efficiency by separating spectral and spatial information processing.
- **3D convolution enhancement**: Add 3D convolution layers to the model to strengthen the structural information and stabilize the training process.
- **Flexible fine-tuning ability**: Make the model easier and more flexible to fine-tune, which is especially suitable when the number of labeled samples is limited.

### Summary

The main contributions of the paper include:

1. Proposing a new Attention-Gated Tuning (AGT) strategy to resolve the conflicts in cross-sensor and cross-modal data.
2. Designing a triplet-structured Transformer model for hyperspectral image classification (Tri-Former), whose flexible architecture can efficiently learn features from a limited number of training samples.
3. Establishing a connection between RGB and HSI datasets, allowing the use of rich labeled RGB data to enhance HSI classification performance, especially when labeled HSI data is limited.

Through these innovations, the paper demonstrates the superior performance of its method on multiple representative hyperspectral image datasets and verifies the effectiveness of the AGT strategy.
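
To make the soft-gate idea concrete, here is a minimal sketch of how an auxiliary branch might gate intermediate features taken from a frozen base model. It is not the authors' implementation: the function names (`soft_gate`, `auxiliary_branch`), the linear-plus-sigmoid gate, the summation used to fuse stages, and the linear head are all assumptions chosen for illustration. The source only states that a lightweight branch consumes intermediate features and that a soft gate suppresses disturbing information and enhances useful information.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def soft_gate(features, weights, bias):
    """Scale each feature channel by a learned gate value in (0, 1).

    A gate near 0 suppresses a channel; a gate near 1 lets it pass.
    The linear gate parameterization here is an illustrative assumption.
    """
    gated = []
    for i, value in enumerate(features):
        # The gate logit for channel i depends on the whole feature vector.
        logit = sum(w * x for w, x in zip(weights[i], features)) + bias[i]
        gated.append(sigmoid(logit) * value)
    return gated


def auxiliary_branch(intermediate_features, gates, head_weights):
    """Fuse gated features from several base-model stages and predict scores.

    intermediate_features: one feature vector per base-model stage
        (the base model itself stays frozen and is not modeled here).
    gates: one (weights, bias) pair per stage; only these and the head
        would be trained in this hypothetical setup.
    head_weights: rows of a linear classification head.
    """
    fused = [0.0] * len(intermediate_features[0])
    for feats, (w, b) in zip(intermediate_features, gates):
        gated = soft_gate(feats, w, b)
        fused = [acc + g for acc, g in zip(fused, gated)]
    return [sum(w * x for w, x in zip(row, fused)) for row in head_weights]
```

In this sketch, only the gate parameters and the classification head would be updated during tuning, which mirrors the stated design goal of leaving the base model's parameters untouched.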

Bridging Sensor Gaps via Attention Gated Tuning for Hyperspectral Image Classification

Attention in Attention for Hyperspectral with High Spatial Resolution (H²) Image Classification

Hyperspectral Image Classification Based on Multibranch Attention Transformer Networks

Incorporating Attention Mechanism and Graph Regularization into CNNs for Hyperspectral Image Classification

When Multigranularity Meets Spatial–Spectral Attention: A Hybrid Transformer for Hyperspectral Image Classification

Multi-Scale Residual Spectral–Spatial Attention Combined with Improved Transformer for Hyperspectral Image Classification

CNN-assisted multi-hop graph attention network for hyperspectral image classification

Spectral-Spatial Center-Aware Bottleneck Transformer for Hyperspectral Image Classification

Hyperspectral Image Classification via Spectral Pooling and Hybrid Transformer

Spectral Spatial Neighborhood Attention Transformer for Hyperspectral Image Classification

Bridging CNN and Transformer With Cross-Attention Fusion Network for Hyperspectral Image Classification

Spectral-Spatial-Sensorial Attention Network with Controllable Factors for Hyperspectral Image Classification

Double Attention Transformer for Hyperspectral Image Classification

Hybrid Spectral Denoising Transformer with Guided Attention

Dual-Stream Discriminative Attention Network for Cross-Scene Hyperspectral Image Classification

Adaptive Learnable Spectral–Spatial Fusion Transformer for Hyperspectral Image Classification

Hyperspectral Image Classification Based on Superpixel Feature Subdivision and Adaptive Graph Structure

MHIAIFormer: Multihead Interacted and Adaptive Integrated Transformer With Spatial-Spectral Attention for Hyperspectral Image Classification

A Hyperspectral Image Classification Method Based on Adaptive Spectral Spatial Kernel Combined with Improved Vision Transformer

Joint Classification of Hyperspectral and LiDAR Data Based on Adaptive Gating Mechanism and Learnable Transformer

Expeditious Hyperspectral Image Classification With Inner and Outer Layered Transformer Using Feature Enhancement