Grouped Bidirectional LSTM Network and Multi-Stage Fusion Convolutional Transformer for Hyperspectral Image Classification

Qin Xu, Chao Yang, Jin Tang, Bin Luo
DOI: https://doi.org/10.1109/tgrs.2022.3207294
IF: 8.2
2022-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: An efficient and discriminative spectral-spatial feature representation is essential for hyperspectral image (HSI) classification. However, most existing methods rely on patch-based convolutional neural networks (CNNs), whose ability to extract global spatial information is very limited. To address this issue, we propose a two-branch network consisting of a grouped bidirectional long short-term memory (GBiLSTM) network and a multistage fusion convolutional transformer (MFCT) for HSI classification. In the proposed GBiLSTM-MFCT, the GBiLSTM network extracts the spectral features of the HSI efficiently by dividing the sequence features and hidden units of the BiLSTM network into several separate groups. To extract the global and local spatial features of the HSI simultaneously, the MFCT fuses the features of different levels obtained from the multiple stages of the convolutional vision transformer (CVT). Moreover, in the multiheaded attention module of each stage, a blueprint separable convolution-based self-attention (BSCA) module is designed to model the global and local spatial information effectively. The outputs of the GBiLSTM network and the MFCT are fused to generate discriminative and robust spectral-spatial features for HSI classification. Experiments on three benchmark datasets, Indian Pines (IN), University of Pavia (UP), and Kennedy Space Center (KSC), demonstrate that the proposed GBiLSTM-MFCT achieves higher classification performance than eight state-of-the-art methods when very few labeled samples are available.
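The efficiency argument behind grouping can be illustrated with a simple parameter count. The sketch below is not the authors' implementation: it assumes a textbook LSTM cell with one bias vector per gate, an even split of both input features and hidden units across groups (as the abstract describes), and hypothetical layer sizes chosen only for illustration.

```python
def lstm_params(input_size: int, hidden_size: int) -> int:
    """Parameter count of one unidirectional LSTM layer (one bias per gate)."""
    # Four gates (input, forget, cell, output); each gate has input weights,
    # recurrent weights, and a bias vector.
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)


def bilstm_params(input_size: int, hidden_size: int) -> int:
    """A bidirectional LSTM runs two independent LSTMs (forward and backward)."""
    return 2 * lstm_params(input_size, hidden_size)


def grouped_bilstm_params(input_size: int, hidden_size: int, groups: int) -> int:
    """Parameter count when features and hidden units are split into groups.

    Assumption: each group is an independent BiLSTM that sees
    input_size / groups features and owns hidden_size / groups hidden units.
    """
    if input_size % groups or hidden_size % groups:
        raise ValueError("input_size and hidden_size must be divisible by groups")
    return groups * bilstm_params(input_size // groups, hidden_size // groups)


if __name__ == "__main__":
    # Hypothetical sizes, not taken from the paper.
    input_size, hidden_size, groups = 64, 128, 4
    full = bilstm_params(input_size, hidden_size)
    grouped = grouped_bilstm_params(input_size, hidden_size, groups)
    print(f"BiLSTM parameters:         {full}")
    print(f"Grouped BiLSTM parameters: {grouped}")
    print(f"Reduction factor:          {full / grouped:.2f}")
```

Under these assumptions the weight matrices shrink by a factor equal to the number of groups while the bias terms stay the same size, so the overall reduction is slightly below that factor (about 3.9 for four groups with the sizes above).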