Learning transferable cross-modality representations for few-shot hyperspectral and LiDAR collaborative classification
Mofan Dai, Shuai Xing, Qing Xu, Hanyun Wang, Pengcheng Li, Yifan Sun, Jiechen Pan, Yuqiong Li
DOI: https://doi.org/10.1016/j.jag.2023.103640
IF: 7.5
2024-01-01
International Journal of Applied Earth Observation and Geoinformation
Abstract: Hyperspectral image (HSI) classification, which exploits both spatial and spectral information, is a crucial topic in earth observation and land cover analysis. However, ground objects with similar spectral attributes remain a challenge for finer classification. Recently, deep learning-based multimodality fusion has offered promising solutions by fusing the geometric information in LiDAR data with spectral attributes. However, labor-intensive and time-consuming multimodal data annotation limits the performance of supervised deep learning methods. Addressing the semantic disparity between LiDAR data and HSIs, and learning transferable representations for cross-scene classification, both remain challenging. In this paper, we propose a multimodal fusion relational network with meta-learning (MFRN-ML) to address these challenges. Specifically, MFRN-ML incorporates multimodal learning and few-shot learning (FSL) into a three-stage task-based learning framework to learn transferable cross-modality representations for few-shot HSI and LiDAR collaborative classification. First, a multimodal fusion relational network, composed of a cross-modality feature fusion module and a relation learning module, is proposed to address the challenge of limited annotations in multimodal learning in a data-adaptive way. Then, the three-stage task-based learning framework trains the network to learn transferable representations from few labeled samples for cross-scene classification. We perform experiments on four multimodal datasets collected by different sensors. Compared with existing supervised, semi-supervised, and meta-learning methods, MFRN-ML attains state-of-the-art performance in few-shot tasks. In particular, our method shows promising generalization ability on unseen categories across different domains.
remote sensing
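The abstract describes task-based (episodic) few-shot classification over fused HSI and LiDAR features. The following is a minimal sketch of what one such episode might look like, not the MFRN-ML implementation. Everything in it is an assumption made for illustration: the function names are hypothetical, plain concatenation of normalised vectors stands in for the paper's cross-modality feature fusion module, and a fixed cosine similarity against class-mean support features stands in for the learned relation module. It also assumes every class has at least `k_shot + n_query` samples.

```python
import numpy as np


def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Draw one N-way K-shot task: support and query indices per class."""
    classes = rng.choice(np.unique(labels), size=n_way, replace=False)
    support, query = [], []
    for c in classes:
        idx = rng.permutation(np.flatnonzero(labels == c))
        support.append(idx[:k_shot])
        query.append(idx[k_shot:k_shot + n_query])
    # Shapes: (n_way,), (n_way, k_shot), (n_way, n_query)
    return classes, np.stack(support), np.stack(query)


def l2_normalise(x):
    """Scale vectors along the last axis to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-12)


def fuse(hsi, lidar):
    """Placeholder fusion (assumption): normalise each modality, concatenate."""
    return np.concatenate([l2_normalise(hsi), l2_normalise(lidar)], axis=-1)


def relation_scores(query_feat, class_feat):
    """Placeholder relation (assumption): cosine similarity, not a learned module."""
    return l2_normalise(query_feat) @ l2_normalise(class_feat).T


def classify_episode(hsi, lidar, labels, n_way=3, k_shot=5, n_query=10, seed=0):
    """Run one episode and return (predicted labels, true labels) for the queries."""
    rng = np.random.default_rng(seed)
    classes, s_idx, q_idx = sample_episode(labels, n_way, k_shot, n_query, rng)
    fused = fuse(hsi, lidar)
    # Average the K support features of each class: (n_way, feature_dim)
    class_feat = fused[s_idx].mean(axis=1)
    q_flat = q_idx.reshape(-1)
    scores = relation_scores(fused[q_flat], class_feat)
    return classes[scores.argmax(axis=1)], labels[q_flat]
```

In a meta-learning setting, many such episodes would be sampled from a source scene to train the fusion and relation components, and a few labeled samples from the target scene would form the support set at test time. Here both components are fixed, so the sketch only shows the data flow of a single episode.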