MS2CANet: Multiscale Spatial–Spectral Cross-Modal Attention Network for Hyperspectral Image and LiDAR Classification

Xianghai Wang, Junheng Zhu, Yining Feng, Lu Wang
DOI: https://doi.org/10.1109/lgrs.2024.3350633
IF: 5.343
2024-02-02
IEEE Geoscience and Remote Sensing Letters
Abstract: Advances in remote-sensing (RS) imaging technology have made multisource RS data increasingly easy to acquire. The fusion and classification of hyperspectral images (HSIs) and light detection and ranging (LiDAR) data have become a research hotspot, both because the two sources are highly complementary and because the rapid development of deep learning (DL) provides effective methods. Most existing methods based on convolutional neural networks (CNNs) use fixed convolution kernels, which makes it difficult to extract multiscale detailed features. In this letter, we propose a multiscale pyramid fusion framework based on spatial–spectral cross-modal attention (S2CA) for HSI and LiDAR classification. The framework has a strong ability to learn multiscale information, especially in areas with complex information changes, and thereby improves classification accuracy. Multiscale pyramid convolution is used to extract multiscale features, and an effective feature recalibration (EFR) module enhances useful features and suppresses useless information at each scale. To increase the interaction of information between modalities, we propose an S2CA module, which uses the features of each modality to enhance those of the other. Experiments are conducted on three real public datasets. Compared with existing advanced methods, the proposed method achieves the best results. The source code of the multiscale S2CA network (MS2CANet) will be made public at https://github.com/junhengzhu/MS2CANet.
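The abstract does not give the internals of the S2CA or EFR modules, so the following is only a minimal NumPy sketch of the general idea of cross-modal enhancement across scales, not the authors' implementation. It assumes both modalities have already been projected to feature maps of the same shape (C, H, W), and it uses an average-pooling pyramid as a simple stand-in for the multiscale pyramid convolution. All function names (channel_gate, spatial_gate, cross_modal_enhance, multiscale_fuse) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_gate(feat):
    """Per-channel gate in (0, 1) from global average pooling; returns (C, 1, 1)."""
    return sigmoid(feat.mean(axis=(1, 2)))[:, None, None]

def spatial_gate(feat):
    """Per-pixel gate in (0, 1) from the channel-wise mean; returns (1, H, W)."""
    return sigmoid(feat.mean(axis=0, keepdims=True))

def cross_modal_enhance(hsi_feat, lidar_feat):
    """Each modality is reweighted by a gate computed from the other one.

    Assumption: the LiDAR branch supplies a spatial gate and the HSI branch
    supplies a channel (spectral) gate. Residual form keeps the original signal.
    """
    hsi_out = hsi_feat + hsi_feat * spatial_gate(lidar_feat)
    lidar_out = lidar_feat + lidar_feat * channel_gate(hsi_feat)
    return hsi_out, lidar_out

def avg_pool(feat, k):
    """Non-overlapping k x k average pooling on a (C, H, W) array."""
    c, h, w = feat.shape
    cropped = feat[:, :h - h % k, :w - w % k]
    return cropped.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))

def multiscale_fuse(hsi_feat, lidar_feat, scales=(1, 2, 4)):
    """Apply cross-modal enhancement at each scale and concatenate pooled vectors."""
    parts = []
    for k in scales:
        h, l = cross_modal_enhance(avg_pool(hsi_feat, k), avg_pool(lidar_feat, k))
        parts.append(h.mean(axis=(1, 2)))  # global average pooling per channel
        parts.append(l.mean(axis=(1, 2)))
    return np.concatenate(parts)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hsi = rng.standard_normal((8, 8, 8))    # hypothetical HSI patch features
    lidar = rng.standard_normal((8, 8, 8))  # hypothetical LiDAR patch features
    print(multiscale_fuse(hsi, lidar).shape)  # (48,)
```

The fused vector has length 2 x C x (number of scales); in a full network it would feed a classifier head, and the fixed sigmoid gates here would be replaced by learned layers.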
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics
What problem does this paper attempt to address?