Boosting Hyperspectral Image Classification with Gate-Shift-Fuse Mechanisms in a Novel CNN-Transformer Approach

Mohamed Fadhlallah Guerri, Cosimo Distante, Paolo Spagnolo, Fares Bougourzi, Abdelmalik Taleb-Ahmed
2024-10-03
Abstract: In hyperspectral image (HSI) classification, every pixel sample is assigned to a land-cover type. CNN-based techniques for HSI classification have notably advanced the field through their strong feature representation capabilities. However, acquiring deep features remains a challenge for these CNN-based methods. In contrast, transformer models are adept at extracting high-level semantic features, offering a complementary strength. This paper's main contribution is the introduction of an HSI classification model that includes two convolutional blocks, a Gate-Shift-Fuse (GSF) block and a transformer block. This model leverages the strengths of CNNs in local feature extraction and transformers in long-range context modelling. The GSF block is designed to strengthen the extraction of local and global spatial-spectral features. An effective attention mechanism module is also proposed to enhance the extraction of information from HSI cubes. The proposed method is evaluated on four well-known datasets (Indian Pines, Pavia University, WHU-Hi-LongKou and WHU-Hi-HanChuan), demonstrating that the proposed framework achieves superior results compared to other models.
Computer Vision and Pattern Recognition
### What problems does this paper attempt to solve?

This paper aims to address the challenges in hyperspectral image (HSI) classification. Specifically, the authors attempt to improve HSI classification performance by combining the advantages of convolutional neural network (CNN) and Transformer models. The main problems the paper attempts to solve are:

1. **Challenges in deep feature extraction**:
   - CNNs perform well in HSI classification, but still face challenges in obtaining deep-level features.
   - Transformer models are good at extracting high-level semantic features, but may not fully utilize local spatial information when used alone.
2. **Fusion of local and global features**:
   - The proposed model needs to effectively fuse local spatial features and global context information to improve classification accuracy.
3. **Improvement of the attention mechanism**:
   - Existing methods do not fully extract local and global information when dealing with hyperspectral data.
   - A more effective attention mechanism module is needed to enhance the extraction of information from the HSI cube.
4. **Dealing with high-dimensional data and noise**:
   - HSI data is high-dimensional and easily affected by noise.
   - The model needs to be robust and maintain high accuracy in a complex data environment.

### Main contributions of the paper

To address the above challenges, the paper makes the following main contributions:

- **Introduction of Gate-Shift-Fuse (GSF) blocks**: Specially designed for HSI classification, the GSF block enhances the ability of the CNN and Transformer components to extract local and global spatial-spectral features.
- **Proposing an effective attention mechanism module**: This module enables the model to better extract local and global information from the HSI cube.
- **Experimental verification**: Evaluated on four well-known datasets (Indian Pines, Pavia University, WHU-Hi-LongKou, and WHU-Hi-HanChuan), the proposed framework outperforms other models.

Through these improvements, the paper aims to provide a more efficient and accurate method for HSI classification, with broad application prospects in fields such as agriculture, biomedical imaging, mineral exploration, food safety, and military reconnaissance.
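The abstract describes classification per pixel sample, with information extracted from HSI cubes. A common way to set this up is to cut a small spatial neighbourhood, with all spectral bands, around each labeled pixel and feed that cube to the network. The following NumPy sketch illustrates only that preprocessing step. The function name `extract_patches`, the default patch size of 9, the reflect padding, and the convention that label 0 means "unlabeled" are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def extract_patches(cube, labels, patch_size=9):
    """Cut one (patch_size, patch_size, bands) cube around each labeled pixel.

    cube:   array of shape (height, width, bands)
    labels: integer array of shape (height, width); 0 is assumed to mean
            "unlabeled", classes are assumed to start at 1.
    """
    if patch_size % 2 == 0:
        raise ValueError("patch_size must be odd so each patch has a center pixel")
    r = patch_size // 2

    # Reflect-pad the two spatial axes so border pixels also get full patches.
    padded = np.pad(cube, ((r, r), (r, r), (0, 0)), mode="reflect")

    # Coordinates of labeled pixels, in row-major order.
    rows, cols = np.nonzero(labels)
    if rows.size == 0:
        empty = np.empty((0, patch_size, patch_size, cube.shape[2]), dtype=cube.dtype)
        return empty, np.empty((0,), dtype=labels.dtype)

    # In padded coordinates, pixel (i, j) sits at (i + r, j + r), so the
    # window [i, i + patch_size) x [j, j + patch_size) is centered on it.
    patches = np.stack(
        [padded[i:i + patch_size, j:j + patch_size, :] for i, j in zip(rows, cols)]
    )
    targets = labels[rows, cols] - 1  # shift class ids so they start at 0
    return patches, targets


# Toy example: a random 6 x 5 scene with 4 bands and three labeled pixels.
rng = np.random.default_rng(0)
toy_cube = rng.random((6, 5, 4))
toy_labels = np.zeros((6, 5), dtype=int)
toy_labels[0, 0], toy_labels[3, 2], toy_labels[5, 4] = 2, 1, 3
toy_patches, toy_targets = extract_patches(toy_cube, toy_labels, patch_size=3)
print(toy_patches.shape, toy_targets)
```

Each returned patch keeps the full spectral axis, so a downstream model is free to mix spatial and spectral information. How the paper's convolutional, GSF, and transformer blocks consume such cubes is not specified in this summary.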