ELS2T: Efficient Lightweight Spectral-Spatial Transformer for Hyperspectral Image Classification

Shichao Zhang, Jiahua Zhang, Xiaopeng Wang, Jingwen Wang, Zhenjiang Wu
DOI: https://doi.org/10.1109/tgrs.2023.3299442
IF: 8.2
2023-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: In recent years, convolutional neural networks (CNNs) have been extensively used in hyperspectral image (HSI) classification and have achieved strong performance. However, although CNNs expand the receptive field by stacking convolutional and pooling layers, the actual receptive field remains insufficient, which makes it hard to capture global representations of HSIs. In addition, existing CNN models for HSI classification have low computational efficiency and cannot effectively fuse spectral and spatial information. To alleviate these limitations, this article proposes a novel efficient lightweight spectral–spatial transformer (ELS2T) designed specifically for HSI classification. The transformer can model long-distance feature dependencies in the HSI cube. First, we design a global multiscale attention module (GMAM) to highlight useful information and suppress useless information. Second, because spectral and spatial information differ in importance for classification, we propose an adaptive feature fusion module (AFFM) that adaptively fuses the acquired spectral and spatial information. Finally, to improve computational efficiency, we design a lightweight separable spatial–spectral self-attention ($\text{S}^{3}\text{A}$) module to replace the multihead self-attention (MHSA) module in the transformer encoder. Experimental results on four well-known hyperspectral datasets show that our model outperforms other state-of-the-art deep learning (DL) methods in both computational efficiency and classification performance.
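The abstract does not specify how the AFFM weights the two feature streams, so the following is only a minimal sketch of one common way to fuse features adaptively: a softmax over two learnable scalars, used as weights in a convex combination. The function name `adaptive_fuse`, the `logits` parameter, and the NumPy formulation are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def adaptive_fuse(spectral, spatial, logits):
    """Fuse two feature arrays with softmax-normalized weights.

    Illustrative assumption, not the paper's AFFM: `logits` is a
    hypothetical pair of scalars that a real model would learn
    jointly with the rest of the network.
    """
    spectral = np.asarray(spectral, dtype=float)
    spatial = np.asarray(spatial, dtype=float)
    if spectral.shape != spatial.shape:
        raise ValueError("spectral and spatial features must share a shape")

    logits = np.asarray(logits, dtype=float)
    # Numerically stable softmax: both weights are positive and sum to 1.
    exp = np.exp(logits - logits.max())
    weights = exp / exp.sum()

    # Convex combination of the two feature streams.
    fused = weights[0] * spectral + weights[1] * spatial
    return fused, weights
```

With equal logits the two streams contribute equally; raising the first logit shifts the fused result toward the spectral features, which is the kind of importance weighting the abstract describes in general terms.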
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics