DBCTNet: Double Branch Convolution-Transformer Network for Hyperspectral Image Classification

Rui Xu, Xue-Mei Dong, Weijie Li, Jiangtao Peng, Weiwei Sun, Yi Xu

DOI: https://doi.org/10.1109/tgrs.2024.3368141

IF: 8.2

2024-03-02

IEEE Transactions on Geoscience and Remote Sensing

Abstract: Currently, deep learning (DL) methods represented by convolutional neural networks (CNNs) or Transformers are of great interest in hyperspectral image (HSI) classification. Recent works show that hybrid models using CNN and Transformer modules are expected to achieve better performance than either module used alone. However, the hybrid models applied to HSI classification combine 2-D CNNs with Transformers, which gives them high computational complexity. In addition, the information in the many spectral dimensions, which distinguishes HSIs from ordinary RGB images, has not been fully exploited. To address this, we propose a double branch Convolution-Transformer network (DBCTNet). Specifically, an MSpeFE module is used for multiscale spectral feature extraction at the early stage of the proposed network. Then, a ConvTE block is designed to improve the original Transformer encoder (TE), where a Conv spectral projection unit and a convolutional multihead self-attention (CMHSA) unit are proposed to extract spatial and global spectral features. A double branch module is further built based on 3-D CNN and ConvTE. This module can fully integrate spatial and local–global spectral features while keeping computational complexity low. Experimental results on four public datasets, Pavia University, Houston, WHU-Hi-LongKou, and HuangHeKou, show that DBCTNet achieves satisfactory performance with a small number of parameters and relatively high efficiency compared to nine other networks. The implementation of DBCTNet will be publicly available at https://github.com/xurui-joei/DBCTNet.

imaging science & photographic technology,remote sensing,engineering, electrical & electronic,geochemistry & geophysics

What problem does this paper attempt to address?

The paper addresses three limitations of existing methods for hyperspectral image (HSI) classification:

1. **High computational complexity**: Existing hybrid models for HSI classification combine 2-D convolutional neural networks (CNNs) with Transformers, which makes them computationally expensive.
2. **Under-utilization of multi-dimensional spectral information**: Unlike ordinary RGB images, an HSI contains information in many spectral dimensions, and this information has not yet been fully exploited.
3. **Joint extraction of local and global features**: Existing methods often cannot handle local and global information at the same time when extracting spatial and spectral features, and they are especially weak in computational efficiency.

To address these problems, the authors propose a double-branch convolution-Transformer network (DBCTNet). Its main contributions are:

1. **Multi-scale spectral feature extraction module (MSpeFE)**: This module extracts features along the spectral dimension at several scales by using convolution kernels of different sizes, which enriches the spectral signal and improves model performance.
2. **Improved Transformer encoder (ConvTE)**: By replacing linear layers with convolution operations, ConvTE extracts spatial and global spectral features with fewer parameters and floating-point operations (FLOPs).
3. **Double-branch module (DBCT)**: Built on a 3-D CNN and ConvTE, this module continuously models local and global representations and simultaneously extracts spatial and local-global spectral features.

Experimental results on four public datasets (Pavia University, Houston, WHU-Hi-LongKou, and HuangHeKou) show that DBCTNet achieves satisfactory performance with a relatively small number of parameters and relatively high efficiency.
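The multi-scale idea behind MSpeFE can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the kernel sizes (3, 5, 7), the box-filter weights, the toy 8-band spectrum, and the function names `conv1d_same` and `multiscale_spectral_features` are all assumptions made for illustration. In the actual network the kernel weights would be learned and the convolutions would operate on image patches rather than a single pixel.

```python
def conv1d_same(signal, kernel):
    """Slide an odd-length kernel along the spectral axis.

    Zero padding keeps the output the same length as the input.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def multiscale_spectral_features(spectrum, kernel_sizes=(3, 5, 7)):
    """Return one feature vector per kernel size (hypothetical sizes)."""
    features = []
    for size in kernel_sizes:
        # Box filter used as a stand-in for learned convolution weights.
        kernel = [1.0 / size] * size
        features.append(conv1d_same(spectrum, kernel))
    return features


# Toy pixel with 8 spectral bands and a single narrow peak at band 2.
spectrum = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
branches = multiscale_spectral_features(spectrum)

for size, feature in zip((3, 5, 7), branches):
    print(size, [round(v, 3) for v in feature])
```

Each branch responds to the same peak over a different spectral extent: the small kernel keeps it narrow, while the larger kernels spread it across more bands. Stacking the branches gives later layers both fine and coarse spectral context.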

A Novel Transformer Network with a CNN-Enhanced Cross-Attention Mechanism for Hyperspectral Image Classification

Multiscale 3-D-2-D Mixed CNN and Lightweight Attention-Free Transformer for Hyperspectral and LiDAR Classification

DCTN: Dual-Branch Convolutional Transformer Network With Efficient Interactive Self-Attention for Hyperspectral Image Classification

A Dual-Branch Multiscale Transformer Network for Hyperspectral Image Classification

CMTNet: Convolutional Meets Transformer Network for Hyperspectral Images Classification

Dual-Branch Adaptive Convolutional Transformer for Hyperspectral Image Classification

Hyperspectral image classification using a double-branch hierarchical partial convolution network

Hyperspectral Image Classification Based on Multibranch Attention Transformer Networks

RDTN: Residual Densely Transformer Network for hyperspectral image classification

Double-branch feature fusion transformer for hyperspectral image classification

CNN and Transformer interaction network for hyperspectral image classification

Double Attention Transformer for Hyperspectral Image Classification

DCN-T: Dual Context Network with Transformer for Hyperspectral Image Classification

Hyperspectral Image Classification via Spectral Pooling and Hybrid Transformer

A Center-Masked Transformer for Hyperspectral Image Classification

Deep global-local transformer network combined with extended morphological profiles for hyperspectral image classification

DECT: Diffusion-Enhanced CNN–Transformer for Multisource Remote Sensing Data Classification

A multimodal hyper-fusion transformer for remote sensing image classification

A dual-branch siamese spatial-spectral transformer attention network for Hyperspectral Image Change Detection

Global–Local 3-D Convolutional Transformer Network for Hyperspectral Image Classification