Abstract: Land use and land cover maps provide fundamental information that has been used in many types of studies, ranging from public health to carbon cycling. However, existing remote sensing image classification methods suffer from insufficient use of multiple modalities, neglect of prior domain knowledge, and poor performance on minority classes. To alleviate these problems, we propose a novel domain knowledge-guided deep collaborative fusion network (DKDFN) for land cover classification that boosts performance on minority categories. More specifically, the DKDFN adopts a multihead encoder and a multibranch decoder structure. The encoder architecture enables sufficient mining of complementary information from multiple modalities, which in our case are Sentinel-2, Sentinel-1, and SRTM Digital Elevation Data (SRTM). The multibranch decoder performs land cover classification in a multitask learning setup: it carries out semantic segmentation while also reconstructing multimodal remote sensing indices, which are selected as representatives of domain knowledge. This design incorporates domain knowledge in an effective end-to-end manner. The training stage of our DKDFN is supervised by our proposed asymmetry loss function (ALF), which boosts performance on nearly all categories, especially those with a low frequency of occurrence. Ablation studies of the network suggest that our design logic is worth testing in any network with an encoder-decoder structure. The study is conducted in Hunan, China, and is verified using a self-labeled multimodal unitemporal remote sensing image dataset. Comparative experiments between DKDFN and six state-of-the-art models (U-Net, SegNet, PSPNet, DeepLab, HRNet, MP-ResNet) demonstrate the superiority of our method and suggest that it could be applied more widely to map land cover in other geographical areas, given the availability of Sentinel-2, Sentinel-1, and SRTM data.
The dataset can be downloaded from https://github.com/LauraChow/HunanMultimodalDataset.
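The abstract describes a decoder branch that reconstructs remote sensing indices as a proxy for domain knowledge, but it does not name the indices used. As a minimal sketch, assuming the normalized difference vegetation index (NDVI) as one plausible example of such an index, a per-pixel reconstruction target could be derived from near-infrared and red reflectance as follows. The function name `ndvi`, the `eps` stabilizer, and the sample values are hypothetical and are not taken from the paper.

```python
def ndvi(nir, red, eps=1e-6):
    """Per-pixel NDVI for two equally shaped 2-D reflectance grids.

    NDVI = (NIR - Red) / (NIR + Red). The small eps (an assumption of
    this sketch) avoids division by zero where both bands are zero.
    """
    return [
        [(n - r) / (n + r + eps) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir, red)
    ]


# Hypothetical 2x2 reflectance patches (made-up values, not from the dataset).
nir = [[0.5, 0.4], [0.3, 0.2]]
red = [[0.1, 0.2], [0.3, 0.2]]

# In a multitask setup, a grid like this could serve as the regression
# target for an index-reconstruction branch alongside the segmentation branch.
target = ndvi(nir, red)
```

Each value lies in the range -1 to 1, with higher values indicating denser vegetation; pixels where the two bands are equal map to 0.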

A Unified Multimodal Deep Learning Framework for Remote Sensing Imagery Classification

More Diverse Means Better: Multimodal Deep Learning Meets Remote Sensing Imagery Classification

Convolutional Neural Networks for Multimodal Remote Sensing Data Classification

A Multi-Modal Unified Representation Learning Framework with Masked Image Modeling for Remote Sensing Images

Transfer Representation Learning Meets Multimodal Fusion Classification for Remote Sensing Images

A unified multimodal classification framework based on deep metric learning

Deep Symmetric Fusion Transformer for Multimodal Remote Sensing Data Classification

Ensemble of Deep Learning-Based Multimodal Remote Sensing Image Classification Model on Unmanned Aerial Vehicle Networks

DKDFN: Domain Knowledge-Guided deep collaborative fusion network for multimodal unitemporal remote sensing land cover classification

Deep Learning in Multimodal Remote Sensing Data Fusion: A Comprehensive Review

Remote Sensing Collaborative Classification Using Multimodal Adaptive Modulation Network

TSCMDL: Multimodal Deep Learning Framework for Classifying Tree Species Using Fusion of 2-D and 3-D Features

Deep Multi-Feature Fusion Network for Remote Sensing Images

A Unified Multiscale Learning Framework for Hyperspectral Image Classification

Multiple Information Collaborative Fusion Network for Joint Classification of Hyperspectral and LiDAR Data

A Collaborative Correlation-Matching Network for Multimodality Remote Sensing Image Classification

Dual-Branch Dynamic Modulation Network for Hyperspectral and LiDAR Data Classification

Multimodal Deep Learning for Semisupervised Classification of Hyperspectral and LiDAR Data

Multimodal Remote Sensing Data Classification Based on Gaussian Mixture Variational Dynamic Fusion Network

Deep Multiview Union Learning Network for Multisource Image Classification