Multi-modal remote sensing image segmentation based on attention-driven dual-branch encoding framework

Zhiyang Li, Xuran Pan, Kexing Xu, Xinqi Yang
DOI: https://doi.org/10.1117/1.jrs.18.026506
IF: 1.568
2024-06-23
Journal of Applied Remote Sensing
Abstract: High-resolution remote sensing images are characterized by rich surface details and diverse features, but single-modality high-resolution images have limited expressive ability in earth object segmentation. We propose a multi-modal remote sensing image segmentation method based on an attention-driven dual-branch encoding framework. The method encodes the multi-modal remote sensing data in parallel to thoroughly extract features from each modality. Attention-driven feature fusion modules then fuse the multi-modal features at multiple stages to generate a high-quality multi-modal feature representation. Extensive experiments are carried out on the International Society for Photogrammetry and Remote Sensing (ISPRS) Vaihingen and Potsdam 2D semantic labeling datasets, which include both RGB/IRRG images and digital surface model (DSM) images. Experimental results demonstrate that: (1) the elevation information in DSM images clearly benefits earth objects with significant heights, and properly introducing DSM images improves segmentation accuracy compared with using only RGB/IRRG images; (2) the attention-driven feature fusion module outperforms traditional feature fusion methods in capturing cross-modal complementary features, leading to outstanding segmentation accuracy for each earth object.
environmental sciences, imaging science & photographic technology, remote sensing
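
The abstract describes fusing features from two parallel encoder branches (RGB/IRRG and DSM) with attention, but does not specify the fusion module. The following is only an illustrative sketch of one common way such a fusion could be built: a squeeze-and-excitation style channel gate that blends the two branches. The function name `channel_attention_fuse`, the weight matrices `w1` and `w2`, and the gating formula are assumptions for illustration, not the authors' design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention_fuse(feat_optical, feat_dsm, w1, w2):
    """Blend two (C, H, W) feature maps with a per-channel attention gate.

    Hypothetical sketch: w1 has shape (r, 2C) and w2 has shape (C, r).
    This is NOT the fusion module from the paper.
    """
    # Concatenate both modalities along the channel axis: (2C, H, W)
    stacked = np.concatenate([feat_optical, feat_dsm], axis=0)
    # Squeeze: global average pooling over the spatial dimensions -> (2C,)
    squeezed = stacked.mean(axis=(1, 2))
    # Excite: small bottleneck with ReLU, then a sigmoid gate per channel -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)
    gate = sigmoid(w2 @ hidden)[:, None, None]  # broadcast over H and W
    # Convex combination: the gate decides how much each branch contributes
    return gate * feat_optical + (1.0 - gate) * feat_dsm

# Usage with random stand-ins for one encoder stage's feature maps
rng = np.random.default_rng(0)
C, H, W, r = 8, 16, 16, 4
optical = rng.normal(size=(C, H, W))   # features from the RGB/IRRG branch
dsm = rng.normal(size=(C, H, W))       # features from the DSM branch
w1 = rng.normal(size=(r, 2 * C))
w2 = rng.normal(size=(C, r))
fused = channel_attention_fuse(optical, dsm, w1, w2)
print(fused.shape)  # (8, 16, 16)
```

In a multistage setup like the one the abstract outlines, a gate of this kind would be applied once per encoder stage, with separate learned weights at each stage; here the weights are random placeholders rather than trained parameters.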