Hyperspectral Image Classification Based on Two-Branch Multiscale Spatial Spectral Feature Fusion with Self-Attention Mechanisms

Boran Ma, Liguo Wang, Heng Wang
DOI: https://doi.org/10.3390/rs16111888
IF: 5
2024-05-25
Remote Sensing
Abstract: In recent years, using deep neural networks for effective feature extraction and designing efficient, high-precision hyperspectral image classification algorithms have become a research hotspot. However, because hyperspectral images are difficult to obtain and costly to annotate, training samples are very limited. To cope with the small-sample problem, researchers often deepen the network model and use attention mechanisms to extract features; however, as the model deepens, gradients vanish, feature extraction ability becomes insufficient, and computational cost rises. How to make full use of the spectral and spatial information in limited samples has therefore become a difficult problem. To address it, this paper proposes FHDANet, a hyperspectral image classification model based on two-branch multiscale spatial–spectral feature aggregation with a self-attention mechanism. The model constructs a dense two-branch pyramid structure that efficiently extracts joint spatial–spectral feature information and spectral feature information, reduces feature loss to a large extent, and strengthens the model's ability to extract contextual information. A channel–space attention module, ECBAM, is proposed, which greatly improves the model's ability to extract salient features, and a spatial information extraction module based on the deep feature fusion strategy HLDFF is proposed, which strengthens feature reusability and mitigates the feature loss caused by deepening the model. Compared with seven hyperspectral image classification algorithms (SVM, SSRN, A2S2K-ResNet, HyBridSN, SSDGL, RSSGL and LANet), this method significantly improves classification performance on four representative datasets. Experiments demonstrate that FHDANet can better extract and utilise the spatial and spectral information in hyperspectral images, with excellent classification performance under small-sample conditions.
environmental sciences; imaging science & photographic technology; remote sensing; geosciences, multidisciplinary
What problem does this paper attempt to address?
The paper addresses several key challenges in hyperspectral image classification (HIC) and proposes a new classification model. Specifically, the study aims to solve the following issues:

1. **Small Sample Problem**: Because hyperspectral images are costly to acquire and annotate, the number of training samples is usually very limited. The challenge is to make full use of the spectral and spatial information in these limited samples.
2. **Insufficient Feature Extraction Capability**: A deeper network can extract more complex features, but it also suffers from vanishing gradients and feature loss, which limit the model's feature extraction capability and computational efficiency.
3. **Feature Loss Problem**: Traditional remedies such as residual connections and dense connections alleviate these issues only partially, because the model does not distinguish more important features from less important ones during training. An attention mechanism is therefore introduced to improve the model's ability to extract salient features.

To address these issues, the authors propose FHDANet, a high-precision hyperspectral image classification model that combines two-branch multi-scale feature fusion with self-attention mechanisms. The model makes three main contributions:

1. **Dual Multi-Scale Feature Aggregation**: A dense pyramid structure extracts multi-scale feature information through a joint spatial-spectral branch and a spectral branch, producing hyperspectral feature maps.
2. **Efficient Channel-Spatial Block Attention Module (ECBAM)**: During spatial feature extraction, ECBAM enhances the model's ability to extract salient features, allocates computational resources effectively, and reduces the impact of the background.
3. **High-Low Feature Fusion Strategy (HLDFF)**: During spatial feature extraction, high-level feature maps are upsampled by deconvolution and fused with low-level feature maps to obtain richer feature representations.

In summary, the goal of this study is to improve hyperspectral image classification under small-sample conditions by designing a model that effectively extracts and utilizes the spatial and spectral information in hyperspectral images.
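
The high-low feature fusion idea described above (upsample a high-level feature map by deconvolution, then fuse it with a low-level map) can be illustrated with a minimal NumPy sketch. This is not the authors' HLDFF implementation: the function names, the single 2x2 kernel shared across channels, the stride of 2, and fusion by channel concatenation are all assumptions chosen for illustration.

```python
import numpy as np


def transposed_conv2d_stride2(x, kernel):
    """Upsample a (C, H, W) feature map to (C, 2H, 2W).

    Illustrative assumption: a 2x2 transposed convolution with stride 2,
    applied per channel with one shared kernel. With these settings each
    input pixel writes one non-overlapping 2x2 patch of the output.
    """
    c, h, w = x.shape
    out = np.zeros((c, 2 * h, 2 * w), dtype=x.dtype)
    for i in range(2):
        for j in range(2):
            # Output position (2r + i, 2s + j) receives x[r, s] * kernel[i, j].
            out[:, i::2, j::2] = x * kernel[i, j]
    return out


def fuse_high_low(low, high, kernel):
    """Fuse a low-level map with an upsampled high-level map.

    Hypothetical fusion rule: concatenate along the channel axis.
    Element-wise addition would be another common choice.
    """
    up = transposed_conv2d_stride2(high, kernel)
    if up.shape[1:] != low.shape[1:]:
        raise ValueError("spatial sizes must match after upsampling")
    return np.concatenate([low, up], axis=0)


# Example: a 4-channel 8x8 low-level map and a 2-channel 4x4 high-level map.
low = np.ones((4, 8, 8))
high = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
kernel = np.ones((2, 2))  # an all-ones kernel reduces to nearest-neighbour upsampling
fused = fuse_high_low(low, high, kernel)
print(fused.shape)
```

In a trained network the kernel would be learned and would usually mix channels. The sketch only shows how the spatial sizes are aligned before fusion, so that fine detail from the low-level map and coarser semantic content from the high-level map end up in one tensor.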