Abstract: In recent years, methods based on deep convolutional neural networks (CNNs) have dominated hyperspectral image (HSI) classification. CNN-based methods are effective at extracting spatial features, but HSIs carry approximately continuous spectral information, usually spanning hundreds of spectral bands, and CNNs cannot mine and represent the sequential properties of these spectral features well. Transformer models, which are built on the attention mechanism, have proven effective at processing sequence data. This study proposes a new spectral-spatial kernel combined with an improved Vision Transformer (ViT) to jointly extract spectral-spatial features for classification. First, the dimensionality of the hyperspectral data is reduced by principal component analysis (PCA); then, shallow features are extracted with the spectral-spatial kernel and fed into the improved ViT model. The improved ViT adds a re-attention mechanism and a local mechanism to the original ViT. The re-attention mechanism increases the diversity of attention maps at different levels. The local mechanism lets ViT make full use of both the local and global information in the data, improving classification accuracy. Finally, a multi-layer perceptron produces the classification result. During training, the Focal Loss function increases the loss weight of small-class and difficult-to-classify samples in the HSI data and reduces the loss weight of easy-to-classify samples, so that the network learns more useful hyperspectral image information. In addition, the Apollo optimizer is used to train the HSI classification model, updating the network parameters so as to minimize the loss function.
We evaluated the classification performance of the proposed method on four different datasets and obtained good classification results for urban land object classification, crop classification, and mineral classification. Compared with the state-of-the-art backbone network, the method achieves a significant improvement in classification accuracy.
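
The loss re-weighting described in the abstract can be illustrated with a minimal sketch of the standard multi-class focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). This is not the authors' implementation: the function names, the default gamma = 2.0, and the optional per-class weights `alpha` are assumptions chosen for illustration, and the abstract does not state which values the paper uses.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one sample's class logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def focal_loss_per_sample(logits, label, gamma=2.0, alpha=None):
    """Focal loss for a single sample.

    gamma (assumed default 2.0) controls how strongly easy samples are
    down-weighted; gamma = 0 recovers ordinary cross-entropy.
    alpha (hypothetical, optional) is a list of per-class weights that can
    raise the contribution of small classes.
    """
    log_pt = log_softmax(logits)[label]   # log-probability of the true class
    pt = math.exp(log_pt)
    weight = (1.0 - pt) ** gamma          # near 0 for confident, correct predictions
    if alpha is not None:
        weight *= alpha[label]
    return -weight * log_pt


def focal_loss(batch_logits, labels, gamma=2.0, alpha=None):
    """Mean focal loss over a batch of samples."""
    losses = [
        focal_loss_per_sample(l, y, gamma, alpha)
        for l, y in zip(batch_logits, labels)
    ]
    return sum(losses) / len(losses)


# Illustrative inputs: one confidently correct sample and one uncertain sample.
easy = [4.0, 0.0, 0.0]
hard = [0.2, 0.0, 0.1]
batch_loss = focal_loss([easy, hard], [0, 0])
```

With gamma = 0 and no `alpha`, the function reduces to plain cross-entropy, which makes the effect of the modulating factor easy to check: relative to cross-entropy, the confidently classified sample is scaled down far more than the uncertain one.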

Asymmetric Residual Transformer for Hyperspectral Image Classification Using Limited Training Samples

A Novel Transformer Network with a CNN-Enhanced Cross-Attention Mechanism for Hyperspectral Image Classification

A hybrid convolution transformer for hyperspectral image classification

Adaptive Learnable Spectral–Spatial Fusion Transformer for Hyperspectral Image Classification

Multi-Scale Residual Spectral–Spatial Attention Combined with Improved Transformer for Hyperspectral Image Classification

Dual-Branch Adaptive Convolutional Transformer for Hyperspectral Image Classification

Hyperspectral Image Transformer Classification Networks

SpectralFormer: Rethinking Hyperspectral Image Classification With Transformers

Hierarchical Attention Transformer for Hyperspectral Image Classification

MHIAIFormer: Multihead Interacted and Adaptive Integrated Transformer With Spatial-Spectral Attention for Hyperspectral Image Classification

RDTN: Residual Densely Transformer Network for hyperspectral image classification

Spectral-Spatial Attention Transformer with Dense Connection for Hyperspectral Image Classification

Expeditious Hyperspectral Image Classification With Inner and Outer Layered Transformer Using Feature Enhancement

Adaptive Pixel-Level and Superpixel-Level Feature Fusion Transformer for Hyperspectral Image Classification

A Hyperspectral Image Classification Method Based on Adaptive Spectral Spatial Kernel Combined with Improved Vision Transformer

Hyperspectral Image Classification via Spectral Pooling and Hybrid Transformer

SWFormer: Stochastic Windows Convolutional Transformer for Hybrid Modality Hyperspectral Classification

When Multigranularity Meets Spatial–Spectral Attention: A Hybrid Transformer for Hyperspectral Image Classification

Hyper-ES2T: Efficient Spatial–Spectral Transformer for the classification of hyperspectral remote sensing images