A Locally Enhanced Transformer Network for Hyperspectral Image Classification

Wei Xiao, Shaoguang Huang, Mbulisi Sibanda, Elhadi Adam, Hongyan Zhang
DOI: https://doi.org/10.1109/igarss53475.2024.10641999
2024-01-01
Abstract: Convolutional neural networks (CNNs) have demonstrated excellent performance in hyperspectral image (HSI) classification. However, CNN-based models often fail to capture long-range contextual information because of their limited receptive fields. In this paper, we propose a locally enhanced transformer network for HSI classification. First, we propose a CNN-based multi-branch spatial-spectral token (SST) module that transforms the HSI into spatial-spectral tokens, reducing information loss during tokenization. Second, after the SST module, we propose a dual-branch transformer module, consisting of a global transformer and a locally enhanced transformer, to capture the global and local spatial features of the HSI. In particular, in the local branch we develop an improved multi-head self-attention (IMSA) that incorporates neighbourhood information derived from superpixel segmentation to improve the local feature extraction ability of the conventional transformer. Experimental results on benchmark datasets demonstrate that the proposed method outperforms state-of-the-art methods.
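The abstract does not say how the superpixel neighbourhood information enters the attention computation. As a rough illustration only, the sketch below assumes one plausible reading: each token attends only to tokens that share its superpixel label. The function name `superpixel_masked_attention`, the identity Q/K/V projections, the single head, and the hard binary mask are all assumptions made for brevity, not the paper's IMSA.

```python
import numpy as np

def superpixel_masked_attention(tokens, labels):
    """Single-head self-attention restricted to tokens in the same superpixel.

    tokens: (n, d) array of token features.
    labels: (n,) integer superpixel label per token (assumed precomputed
            by any superpixel segmentation method).
    Returns the attended features (n, d) and the attention weights (n, n).
    """
    n, d = tokens.shape
    # Assumption: identity projections; a real model would learn Q, K, V.
    q, k, v = tokens, tokens, tokens
    scores = q @ k.T / np.sqrt(d)
    # mask[i, j] is True when tokens i and j share a superpixel label.
    mask = labels[:, None] == labels[None, :]
    scores = np.where(mask, scores, -np.inf)
    # Row-wise softmax; the diagonal is always unmasked, so each row
    # has at least one finite entry.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

# Four tokens, two superpixels: tokens 0-1 and tokens 2-3.
tokens = np.arange(12, dtype=float).reshape(4, 3) / 10.0
labels = np.array([0, 0, 1, 1])
out, weights = superpixel_masked_attention(tokens, labels)
```

In this toy setup the attention weights between tokens of different superpixels are exactly zero, so the local branch mixes features only within a segment, while a separate unmasked branch would be responsible for global context.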