Abstract: The effective fusion of multisource data helps to improve the performance of land cover classification. Most existing convolutional neural network (CNN)-based methods adopt an early or late fusion strategy to fuse low- or high-level features for classification, which leaves two inherent challenges: 1) the conventional convolution operation performs a weighted average over every pixel in the receptive field, which reduces the discriminability of the center pixel because of interfering pixels, and 2) the spatial-spectral features of the hyperspectral image (HSI), the elevation features of light detection and ranging (LiDAR) data, and the complementary features between the multimodal data are not fully exploited, which reduces classification accuracy. In this article, an effective multibranch feature fusion network with self- and cross-guided attention (MB2FscgaNet) is proposed for the joint classification of LiDAR and HSI. The main concern of this article is how to accurately estimate more effective spectral-spatial-elevation features and transfer them more effectively through the network. Specifically, MB2FscgaNet adopts a multibranch feature fusion architecture to fully exploit the hierarchical features from LiDAR and HSI level by level. At each level of the network, a self- and cross-guided attention (SCGA) module is developed to assign higher weights to the areas and channels of interest in the LiDAR and HSI feature maps, yielding refined spectral-spatial-elevation features and providing complementary cross-guidance between LiDAR and HSI. A spectral supplement module (SeSuM) is further designed to improve the discriminative ability of the center pixel. Comparative classification results and ablation studies demonstrate that the proposed MB2FscgaNet achieves competitive performance against state-of-the-art methods.
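
The idea of self- and cross-guided channel weighting can be illustrated with a minimal sketch. This is not the SCGA module from the article, whose internals the abstract does not specify. It assumes a deliberately simple form: each channel weight is a sigmoid of that channel's global average, and each modality is rescaled by the sum of its own weights (self-guidance) and the other modality's weights (cross-guidance). The function names, the pooling choice, and the requirement that both inputs have the same number of channels are all assumptions made for illustration.

```python
import math


def sigmoid(v):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def channel_weights(feat):
    """Assumed descriptor: global-average-pool each channel, then squash.

    `feat` is a list of channels, each a flat list of spatial values.
    """
    return [sigmoid(sum(ch) / len(ch)) for ch in feat]


def guide(feat, own_w, other_w):
    """Rescale each channel by its self weight plus the cross weight."""
    return [[(ws + wc) * v for v in ch]
            for ch, ws, wc in zip(feat, own_w, other_w)]


def self_cross_guided(hsi, lidar):
    """Hypothetical self- and cross-guided channel attention.

    Assumes `hsi` and `lidar` have the same number of channels.
    Returns the two refined feature maps.
    """
    w_hsi = channel_weights(hsi)
    w_lidar = channel_weights(lidar)
    return guide(hsi, w_hsi, w_lidar), guide(lidar, w_lidar, w_hsi)


# Toy inputs: 2 channels, 2 spatial positions each.
hsi_feat = [[1.0, 1.0], [-1.0, -1.0]]
lidar_feat = [[0.0, 0.0], [0.0, 0.0]]
hsi_refined, lidar_refined = self_cross_guided(hsi_feat, lidar_feat)
```

In this toy form the HSI channel with the larger mean response receives the larger combined weight, while the LiDAR weights contribute a constant 0.5 per channel because the LiDAR input is all zeros. A real network would learn these weights and would typically add spatial attention as well.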

Combined Classification of Hyperspectral and LiDAR Data Based on Dual-Channel Cross-Transformer

Multiscale 3-D-2-D Mixed CNN and Lightweight Attention-Free Transformer for Hyperspectral and LiDAR Classification

Joint Classification of Hyperspectral Images and LiDAR Data Based on Dual-Branch Transformer

A Novel Transformer Network with a CNN-Enhanced Cross-Attention Mechanism for Hyperspectral Image Classification

Classification of hyperspectral and LiDAR data by transformer-based enhancement

Dual-Branch Feature Fusion Network Based Cross-Modal Enhanced CNN and Transformer for Hyperspectral and LiDAR Classification

Modality Fusion Vision Transformer for Hyperspectral and LiDAR Data Collaborative Classification

A Joint Convolutional Cross ViT Network for Hyperspectral and Light Detection and Ranging Fusion Classification

MCFT: Multimodal Contrastive Fusion Transformer for Classification of Hyperspectral Image and LiDAR Data

A New Multi-Level Attention Feature Fusion Method for Hyperspectral and Lidar Data Joint Classification

Joint Classification of Hyperspectral and LiDAR Data Using a Hierarchical CNN and Transformer

Attention Fusion of Transformer-Based and Scale-Based Method for Hyperspectral and LiDAR Joint Classification

A multimodal hyper-fusion transformer for remote sensing image classification

Mutually Beneficial Transformer for Multimodal Data Fusion

Multimodal Fusion Transformer for Remote Sensing Image Classification

Multiple Information Collaborative Fusion Network for Joint Classification of Hyperspectral and LiDAR Data

Cross Attention-Based Multi-Scale Convolutional Fusion Network for Hyperspectral and LiDAR Joint Classification

Deep Symmetric Fusion Transformer for Multimodal Remote Sensing Data Classification

Multibranch Feature Fusion Network with Self- and Cross-Guided Attention for Hyperspectral and LiDAR Classification

Multi-Scale Feature Fusion for Hyperspectral and Lidar Data Joint Classification

Cross-Modality Fusion Transformer for Multispectral Object Detection