SVAFormer: Integrating Random and Hierarchical Spectral View Attention for Hyperspectral Image Classification

Ning Chen, Zhou Huang, Xia Yue, Anfeng Liu, Meiyun Lu, Jun Yue, Leyuan Fang
DOI: https://doi.org/10.1109/tgrs.2024.3509478
IF: 8.2
2024-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Recently, Hyperspectral Image (HSI) classification methods based on Transformers have developed rapidly. However, these methods still face challenges in handling the widely varying scales and diverse spatial distribution patterns commonly found in HSI. To address these issues, this paper proposes a simple yet novel HSI classification framework named the Spectral View Attention Transformer (SVAFormer). Built on the Transformer mechanism, this framework enhances the integration of spectral and spatial features by allowing the Spectral Token, corresponding to the pixel to be classified, to access spatial neighborhood information from multiple perspectives and levels. Specifically, the framework employs random masking techniques to provide Spectral Tokens with spatial neighborhood information from different viewpoints, enabling the model to handle diverse land-cover distribution patterns. Additionally, the framework introduces a Spectral Token Aware Pooling Layer between adjacent Transformer Blocks, which preserves the central role of Spectral Tokens while progressively expanding the spatial scale represented by each token. This reduces the Transformer’s focus on spatially fragmented information and enables Spectral Tokens to concentrate on spatial neighborhood information at various levels and scales. The key characteristic of this framework is its ability to effectively handle land-cover features of different scales and shapes by strengthening the fusion of spectral and spatial characteristics. Experimental results on multiple public datasets demonstrate that our framework outperforms previous state-of-the-art methods. For the sake of reproducibility, the source code of SVAFormer will be publicly available at https://github.com/chenning0115/SVAFormer.
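The two ideas named in the abstract, random masking of the spatial neighborhood and pooling that keeps the Spectral Token intact, can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that); the function names `random_view_mask` and `center_aware_pool`, the `keep_ratio` parameter, and the choice of 2x2 average pooling are all assumptions made purely for illustration, and the Spectral Token is assumed to sit at the center of an odd-sized patch.

```python
import random


def random_view_mask(patch_size, keep_ratio, rng):
    """Return a row-major boolean keep-mask over a patch_size x patch_size patch.

    Hypothetical helper: a random subset of neighbor positions is kept,
    while the center position (the pixel to classify) is always kept.
    Assumes patch_size is odd.
    """
    n = patch_size * patch_size
    center = n // 2
    neighbors = [i for i in range(n) if i != center]
    n_keep = round(keep_ratio * len(neighbors))
    kept = set(rng.sample(neighbors, n_keep))
    kept.add(center)
    return [i in kept for i in range(n)]


def center_aware_pool(tokens, size):
    """Average-pool a size x size token grid with 2x2 windows (stride 2).

    Hypothetical helper: the center token is returned separately and
    unchanged, so it is never blended with its neighbors. Windows at the
    border are clipped to the grid.
    tokens: row-major list of feature vectors (lists of floats).
    Returns (center_token, pooled_tokens, new_size).
    """
    dim = len(tokens[0])
    center = tokens[(size * size) // 2]
    new_size = (size + 1) // 2
    pooled = []
    for r in range(new_size):
        for c in range(new_size):
            window = [
                tokens[rr * size + cc]
                for rr in range(2 * r, min(2 * r + 2, size))
                for cc in range(2 * c, min(2 * c + 2, size))
            ]
            pooled.append(
                [sum(t[d] for t in window) / len(window) for d in range(dim)]
            )
    return center, pooled, new_size


if __name__ == "__main__":
    rng = random.Random(0)
    # Two different random "views" of the same 5x5 neighborhood.
    view_a = random_view_mask(5, 0.5, rng)
    view_b = random_view_mask(5, 0.5, rng)
    print(sum(view_a), sum(view_b))

    # One pooling step: 5x5 grid of 1-D tokens becomes a 3x3 grid.
    grid = [[float(i)] for i in range(25)]
    center, pooled, new_size = center_aware_pool(grid, 5)
    print(center, new_size, len(pooled))
```

Drawing a fresh mask per call gives the central token a different subset of neighbors each time, while repeated pooling steps widen the area each remaining token summarizes; how SVAFormer actually combines these inside its Transformer Blocks is specified in the paper and its code, not here.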