Global and Local Attention-Based Transformer for Hyperspectral Image Change Detection

Ziyi Wang, Feng Gao, Junyu Dong, Qian Du
2024-11-21
Abstract: Recently, Transformer-based hyperspectral image (HSI) change detection methods have shown remarkable performance. Nevertheless, existing attention mechanisms in Transformers have limitations in local feature representation. To address this issue, we propose the Global and Local Attention-based Transformer (GLAFormer), which incorporates a global and local attention module (GLAM) to combine high-frequency and low-frequency signals. Furthermore, we introduce a cross-gating mechanism, called the cross-gated feed-forward network (CGFN), to emphasize salient features and suppress noise interference. Specifically, the GLAM splits attention heads into global and local attention components to capture comprehensive spatial-spectral features. The global attention component employs global attention on downsampled feature maps to capture low-frequency information, while the local attention component focuses on high-frequency details using non-overlapping window-based local attention. The CGFN enhances the feature representation via convolutions and a cross-gating mechanism in parallel paths. The proposed GLAFormer is evaluated on three HSI datasets. The results demonstrate its superiority over state-of-the-art HSI change detection methods. The source code of GLAFormer is available at https://github.com/summitgao/GLAFormer.
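As a rough illustration of the head split described in the abstract, the NumPy sketch below runs one "global" head, whose keys and values come from a downsampled feature map, next to one "local" head restricted to non-overlapping windows. It is a simplified sketch under stated assumptions, not the authors' implementation (the linked repository is the reference for that). It omits learned query/key/value projections, treats the two channel halves as the two heads, and uses average pooling as the downsampling step. The names `glam_sketch`, `global_branch`, `local_branch`, `pool`, and `win` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # Scaled dot-product attention over the last two axes.
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v


def global_branch(x, pool=2):
    # x: (H, W, C). Every pixel queries an average-pooled copy of the map,
    # so each output mixes coarse (low-frequency) context from the whole image.
    # Assumption: average pooling stands in for the paper's downsampling.
    H, W, C = x.shape
    pooled = x.reshape(H // pool, pool, W // pool, pool, C).mean(axis=(1, 3))
    q = x.reshape(H * W, C)
    kv = pooled.reshape(-1, C)
    return attention(q, kv, kv).reshape(H, W, C)


def local_branch(x, win=2):
    # x: (H, W, C). Self-attention inside each non-overlapping win x win window,
    # so each output depends only on its own window (fine, high-frequency detail).
    H, W, C = x.shape
    w = x.reshape(H // win, win, W // win, win, C).transpose(0, 2, 1, 3, 4)
    w = w.reshape(H // win, W // win, win * win, C)
    out = attention(w, w, w)
    out = out.reshape(H // win, W // win, win, win, C).transpose(0, 2, 1, 3, 4)
    return out.reshape(H, W, C)


def glam_sketch(x, pool=2, win=2):
    # Assumption: the first half of the channels acts as the global head and
    # the second half as the local head; the outputs are concatenated.
    half = x.shape[-1] // 2
    g = global_branch(x[..., :half], pool)
    l = local_branch(x[..., half:], win)
    return np.concatenate([g, l], axis=-1)


x = np.random.default_rng(0).normal(size=(4, 4, 8))
print(glam_sketch(x).shape)  # (4, 4, 8)
```

In this sketch H and W must be divisible by `pool` and `win`. The point of the split is visible in the two branches: perturbing one pixel changes the global head's output everywhere, but changes the local head's output only inside that pixel's window.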
Image and Video Processing
What problem does this paper attempt to address?
This paper addresses two main limitations of existing methods for hyperspectral image (HSI) change detection:

1. **Insufficient local feature representation**: Existing Transformer-based methods for HSI change detection tend to focus on global features and overlook local ones. The resulting loss of local detail degrades change detection performance.
2. **Limited non-linear feature transformation ability**: Traditional feed-forward networks (FFNs) have limited capacity for non-linear feature transformation when processing the Transformer's attention output, and they are easily affected by noise.

To address these problems, the authors propose a new model named GLAFormer. It strengthens local feature representation and non-linear feature transformation by introducing the global and local attention module (GLAM) and the cross-gated feed-forward network (CGFN):

- **Global and local attention module (GLAM)**: This module divides the attention heads into a global part and a local part, which capture low-frequency and high-frequency signals respectively, yielding a more comprehensive spatial-spectral feature representation.
- **Cross-gated feed-forward network (CGFN)**: This network enhances feature representation through parallel paths that combine convolutions with a cross-gating mechanism, amplifying important information and suppressing noise interference.

In experiments on three hyperspectral image datasets, GLAFormer outperforms existing state-of-the-art methods in terms of overall accuracy (OA) and the Kappa coefficient.
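For readers unfamiliar with the two metrics, the sketch below computes OA and Cohen's Kappa from a binary changed/unchanged confusion matrix using their standard definitions. The function name `oa_and_kappa` and the pixel counts in the usage lines are made up for illustration; they are not results from the paper.

```python
def oa_and_kappa(tp, fp, fn, tn):
    """Overall accuracy and Cohen's Kappa for a binary change map.

    tp: changed pixels predicted as changed
    fp: unchanged pixels predicted as changed
    fn: changed pixels predicted as unchanged
    tn: unchanged pixels predicted as unchanged
    """
    n = tp + fp + fn + tn
    # OA: fraction of pixels labelled correctly.
    oa = (tp + tn) / n
    # Expected agreement by chance, from the row and column marginals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    # Kappa: agreement beyond chance, normalised by the maximum possible.
    # Undefined (division by zero) when pe == 1.
    kappa = (oa - pe) / (1 - pe)
    return oa, kappa


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
oa, kappa = oa_and_kappa(tp=40, fp=10, fn=5, tn=45)
print(round(oa, 2), round(kappa, 2))  # 0.85 0.7
```

Kappa is usually reported alongside OA because change detection datasets tend to be dominated by unchanged pixels, and OA alone can look high for a model that rarely predicts change.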