Abstract: Deep learning based methods, such as convolutional neural networks (CNNs), have demonstrated their efficiency in hyperspectral image (HSI) classification. These methods can automatically learn spectral-spatial discriminative features within local patches. However, each pixel in an HSI is related not only to its nearby pixels but also to pixels far away from it. Therefore, to incorporate long-range contextual information, a deep fully convolutional network (FCN) with an efficient non-local module, named ENL-FCN, is proposed for HSI classification. In the proposed framework, a deep FCN takes an entire HSI as input and extracts spectral-spatial information in a local receptive field. The efficient non-local module is embedded in the network as a learning unit to capture long-range contextual information. Unlike traditional non-local neural networks, the module extracts long-range contextual information along a specially designed criss-cross path for computational efficiency. Furthermore, by using a recurrent operation, each pixel's response is aggregated from all pixels of the HSI. The benefits of the proposed ENL-FCN are threefold: 1) long-range contextual information is incorporated effectively, 2) the efficient module can be freely embedded in a deep neural network in a plug-and-play fashion, and 3) it has far fewer learnable parameters and requires fewer computational resources. Experiments conducted on three popular HSI datasets demonstrate that the proposed method achieves state-of-the-art classification performance with lower computational cost than several leading deep neural networks for HSI.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

The paper addresses how to model long-range contextual information efficiently in hyperspectral image (HSI) classification. Existing deep-learning-based methods, such as convolutional neural networks (CNNs), can automatically extract spectral-spatial discriminative features from local regions, but the information they extract is limited to a small fixed area and does not make full use of non-local information. Each pixel in an HSI is related not only to its neighboring pixels but also to distant ones. For example, the same ground object type may appear at different positions in the scene. To extract long-range contextual information, the paper therefore proposes a fully convolutional network with an efficient non-local module (ENL-FCN) that improves both the performance and the efficiency of HSI classification.

### Main contributions

1. **End-to-end trainable deep-learning framework**: By combining a fully convolutional network (FCN) with an efficient non-local module, the framework extracts local and non-local information simultaneously.
2. **Efficient non-local module**: Compared with traditional non-local modules, this module is more efficient in computation and memory usage, and multiple modules can be stacked to further improve performance.
3. **Significantly lower resource use**: Compared with the original non-local module, the efficient non-local module reduces memory use by more than a factor of three and the number of learnable parameters by about a factor of one hundred.

Experimental results show that the proposed ENL-FCN achieves state-of-the-art classification performance on multiple HSI datasets at lower computational cost.

### Method overview

1. **Fully convolutional network (FCN)**: As the backbone network, the FCN extracts local spectral-spatial information from the entire HSI.
2. **Efficient non-local module**: This module computes the relationship between pixels along a criss-cross path, which captures long-range contextual information efficiently. The specific operations are:
   - Using 1×1 convolution kernels to generate feature maps \( Q \) and \( K \).
   - Calculating the non-local correlation between pixels along the criss-cross path to generate an attention map \( A \).
   - Applying the attention map to the feature map \( V \) to generate a new feature map \( E' \).
3. **Recurrent operation**: Applying the module recurrently strengthens the capture of contextual information, so that the response of each pixel is aggregated from all pixels in the entire HSI.
4. **Loss function**: The network is trained with the cross-entropy loss, and a training mask ensures that only labeled samples contribute to the loss.

### Experimental results

The paper reports experiments on three widely used HSI datasets: Indian Pines (IP), Pavia University (PU), and Kennedy Space Center (KSC). The proposed ENL-FCN achieves state-of-the-art classification performance on these datasets at lower computational cost.

### Conclusion

By introducing an efficient non-local module, the paper addresses the modeling of long-range contextual information in HSI classification and improves both classification performance and computational efficiency.
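The criss-cross attention, recurrent operation, and masked loss described in the method overview can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the weight matrices `w_q`, `w_k`, `w_v`, the residual connection, the sharing of weights across recurrent steps, and the masking of each pixel's duplicate entry in its column are all assumptions made for illustration.

```python
import numpy as np


def criss_cross_attention(x, w_q, w_k, w_v):
    """One criss-cross attention pass over a feature map x of shape (C, H, W).

    w_q, w_k: (Cq, C) and w_v: (C, C) stand in for 1x1 convolutions,
    which act as per-pixel linear maps over the channel axis.
    """
    _, h, w = x.shape
    q = np.einsum("dc,chw->dhw", w_q, x)  # queries Q, (Cq, H, W)
    k = np.einsum("dc,chw->dhw", w_k, x)  # keys K,    (Cq, H, W)
    v = np.einsum("dc,chw->dhw", w_v, x)  # values V,  (C, H, W)

    # Correlation of pixel (i, j) with every pixel in row i: shape (H, W, W).
    e_row = np.einsum("dij,dik->ijk", q, k)
    # Correlation of pixel (i, j) with every pixel in column j: shape (H, W, H).
    e_col = np.einsum("dij,dkj->ijk", q, k)
    # Pixel (i, j) is already covered by the row term, so mask it in the
    # column term (assumption: each pixel attends to H + W - 1 positions).
    idx = np.arange(h)
    e_col[idx, :, idx] = -np.inf

    # Softmax over the criss-cross path gives the attention map A.
    e = np.concatenate([e_row, e_col], axis=-1)
    e -= e.max(axis=-1, keepdims=True)
    a = np.exp(e)
    a /= a.sum(axis=-1, keepdims=True)
    a_row, a_col = a[..., :w], a[..., w:]

    # Aggregate V along the row and the column of each pixel.
    agg = np.einsum("ijk,cik->cij", a_row, v) + np.einsum("ijk,ckj->cij", a_col, v)
    return x + agg  # residual connection (assumption)


def recurrent_criss_cross(x, w_q, w_k, w_v, steps=2):
    """Apply the module repeatedly with shared weights (assumption).

    After two passes, every pixel has received information from all pixels.
    """
    for _ in range(steps):
        x = criss_cross_attention(x, w_q, w_k, w_v)
    return x


def masked_cross_entropy(logits, labels, train_mask):
    """Mean cross-entropy over the pixels where train_mask is True.

    logits: (K, H, W) class scores, labels: (H, W) integer classes,
    train_mask: (H, W) boolean, True only for labeled training pixels.
    """
    z = logits - logits.max(axis=0, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=0, keepdims=True))
    rows, cols = np.nonzero(train_mask)
    return -log_p[labels[rows, cols], rows, cols].mean()
```

In this sketch, one pass lets pixel (i, j) see only row i and column j. A second pass reaches any other pixel (u, v) through the intermediate positions (i, v) and (u, j), which is why two recurrent steps are enough for full-image context. For an H×W feature map, each pixel attends to H + W - 1 positions instead of all H·W, which is where the saving over a full non-local attention map comes from.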

Efficient Deep Learning of Non-local Features for Hyperspectral Image Classification

A Hyperspectral Image Classification Method Based on the Nonlocal Attention Mechanism of a Multiscale Convolutional Neural Network

Random Convolutional Network for Hyperspectral Image Classification

Objective evaluation-based efficient learning framework for hyperspectral image classification

HSI-CNN: A Novel Convolution Neural Network for Hyperspectral Image

A Hybrid 3D–2D Feature Hierarchy CNN with Focal Loss for Hyperspectral Image Classification

A Spectral-Spatial Domain-Specific Convolutional Deep Extreme Learning Machine for Supervised Hyperspectral Image Classification

Deep Feature Extraction and Classification of Hyperspectral Images Based on Convolutional Neural Networks

Deep Fully Convolutional Embedding Networks for Hyperspectral Images Dimensionality Reduction

Multi-Level Feature Extraction Networks for Hyperspectral Image Classification

A Fast Dynamic Graph Convolutional Network and CNN Parallel Network for Hyperspectral Image Classification

Three-dimensional Densely Connected Convolutional Network for Hyperspectral Remote Sensing Image Classification

Wide and Deep Fourier Neural Network for Hyperspectral Remote Sensing Image Classification

Spatial–Spectral Feature Extraction via Deep ConvLSTM Neural Networks for Hyperspectral Image Classification

Lightweight Spatial-Spectral Network Based on 3D-2D Multi-Group Feature Extraction Module for Hyperspectral Image Classification

Hyperspectral Image Classification Based on Multilevel Joint Feature Extraction Network

A Fast Dense Spectral-Spatial Convolution Network Framework for Hyperspectral Images Classification

DCTN: Dual-Branch Convolutional Transformer Network With Efficient Interactive Self-Attention for Hyperspectral Image Classification

Hybrid Fully Connected Tensorized Compression Network for Hyperspectral Image Classification

Hyperspectral Image Classification Based on Dual-Scale Dense Network with Efficient Channel Attentional Feature Fusion