Multimodal Deep Learning for Semisupervised Classification of Hyperspectral and LiDAR Data

Chunyu Pu, Yingxu Liu, Shuai Lin, Xu Shi, Zhengying Li, Hong Huang
DOI: https://doi.org/10.1109/tbdata.2024.3433494
2024-01-01
IEEE Transactions on Big Data
Abstract: Deep learning (DL) has emerged as a competitive method in single-modality-dominated remote sensing (RS) data classification tasks, but its classification performance inevitably encounters a bottleneck due to the lack of representation diversity in complicated spatial structures with various land cover types. Therefore, the RS community has been actively researching multimodal feature learning techniques for the same scene. However, expert annotation of multisource data consumes a significant amount of time and cost. This article proposes an end-to-end method called semisupervised multimodal dual-path network (SMDN). This method simultaneously explores the spatial-spectral features contained in hyperspectral images (HSI) and the elevation information provided by light detection and ranging (LiDAR). SMDN exploits a novel unsupervised encoder-decoder structure as the backbone network to construct a multimodal DL architecture by jointly training with a data-specific branch. To obtain discriminative multimodal representations, SMDN is able to guide the collaborative training of two different unsupervised features mapped in the latent subspace with limited labeled training samples. Furthermore, after a simple modification of the fusion strategy in SMDN, it can be applied to unsupervised classification problems. Experimental results on benchmark RS datasets validate the effectiveness of the developed SMDN compared with many state-of-the-art methods.