Double Attention Transformer for Hyperspectral Image Classification

Ping Tang, Meng Zhang, Zhihui Liu, Rong Song
DOI: https://doi.org/10.1109/lgrs.2023.3248582
IF: 5.343
2023-03-11
IEEE Geoscience and Remote Sensing Letters
Abstract: Convolutional neural networks (CNNs) have become one of the most popular tools for hyperspectral image (HSI) classification. However, CNNs struggle to model long-range dependencies, which may degrade classification performance. To address this issue, this letter proposes a transformer-based backbone network for HSI classification. The core component is a newly designed double-attention transformer encoder (DATE), which contains two self-attention modules: a spectral attention module (SPE) and a spatial attention module (SPA). SPE extracts the global dependencies among spectral bands, and SPA mines the local spatial correlations among pixels. The local spatial tokens and the global spectral token are fused together and updated by SPA. In this way, DATE captures the global dependencies among spectral bands while also extracting local spatial information, which greatly improves classification performance. To reduce possible information loss as the network depth increases, a new skip connection mechanism is devised for cross-layer feature fusion. Experimental results on several datasets indicate that the proposed method achieves classification performance that is highly competitive with state-of-the-art methods.
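The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the general idea: a spectral attention step whose output is pooled into one global token, followed by a spatial attention step over that token fused with the pixel tokens. Every name here (`make_params`, `double_attention`, the `embed` projection), the mean pooling, the concatenation used for fusion, the single attention head, and the random weights are assumptions made for illustration, not the authors' design. In this sketch, spatial locality comes only from feeding a small patch around the target pixel.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    # Single-head scaled dot-product self-attention over the rows of `tokens`.
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores, axis=-1) @ v

def make_params(h, w, bands, d, seed=0):
    # Hypothetical random projection weights; a real model would learn these.
    rng = np.random.default_rng(seed)
    n = h * w
    return {
        "spe": [rng.normal(scale=0.1, size=(n, d)) for _ in range(3)],
        "embed": rng.normal(scale=0.1, size=(bands, d)),
        "spa": [rng.normal(scale=0.1, size=(d, d)) for _ in range(3)],
    }

def double_attention(patch, params):
    h, w, bands = patch.shape
    pixels = patch.reshape(h * w, bands)              # (N, B): one row per pixel

    # Spectral step (assumed form): each band is a token, attention runs across bands.
    band_tokens = pixels.T                            # (B, N)
    spe_out = self_attention(band_tokens, *params["spe"])   # (B, d)
    spectral_token = spe_out.mean(axis=0, keepdims=True)    # (1, d), assumed pooling

    # Spatial step (assumed form): each pixel is a token, fused with the spectral token.
    spatial_tokens = pixels @ params["embed"]         # (N, d)
    fused = np.concatenate([spectral_token, spatial_tokens], axis=0)  # (N + 1, d)
    spa_out = self_attention(fused, *params["spa"])   # (N + 1, d)

    # Plain residual connection; the letter's cross-layer skip scheme is not modeled.
    return fused + spa_out

patch = np.random.default_rng(1).normal(size=(5, 5, 30))  # toy 5x5 patch, 30 bands
out = double_attention(patch, make_params(5, 5, 30, d=16))
print(out.shape)
```

For a 5x5 patch with 30 bands and an embedding width of 16, the result has one row for the spectral token plus one per pixel, that is, shape (26, 16).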
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics