D-CANet: Diverse Class-Aware Coding and Decoding Structure Network for Semantic Segmentation of High-Resolution Remote Sensing Images

Zhengwu Yuan, Wen Shao, Qiang Chen, Yingqi Ke
DOI: https://doi.org/10.1117/12.3033644
2024-01-01
Abstract: The substantial scale variation and intra-class diversity within remote sensing imagery pose significant challenges for semantic segmentation, rendering methods developed for natural images inapplicable. To address these challenges, we introduce a novel semantic segmentation model named D-CANet, which primarily comprises three modules: the Global Class Center Awareness (GCCA) module, the Local Class Awareness Module (LCAM), and the Global Class Generation (GCG) module. Specifically, the GCCA module models the global representation of class context to mitigate interference from image backgrounds. The LCAM generates a local class representation that serves as an intermediary perceptual element, implicitly linking pixels to the global class representations and reducing intra-class variance. After the LCAM has processed the features, the GCG module enhances the global class representation. Equipped with the GCCA, LCAM, and GCG modules, this encoder-decoder structure achieves precise segmentation of objects of varying scales in remote sensing imagery through the interactive perception and fusion of global and local features. Experiments on the Potsdam and Vaihingen datasets show that D-CANet surpasses current state-of-the-art semantic segmentation methods.
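The abstract does not give formulas for its class representations, so the following is only a minimal sketch of one common way to build a "class center": a probability-weighted mean of pixel features per class. It is not taken from D-CANet. The function names `class_centers` and `pixel_class_affinity`, the input layout, and the dot-product affinity are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def class_centers(features: List[Vector], probs: List[Vector]) -> List[Vector]:
    """Probability-weighted mean feature per class (illustrative assumption).

    features: one feature vector per pixel (N x C).
    probs: one class-probability vector per pixel (N x K), for example a
           coarse softmax prediction. Both inputs are hypothetical.
    """
    num_classes = len(probs[0])
    dim = len(features[0])
    centers = [[0.0] * dim for _ in range(num_classes)]
    weights = [0.0] * num_classes
    for feat, prob in zip(features, probs):
        for k, p in enumerate(prob):
            weights[k] += p
            for c in range(dim):
                centers[k][c] += p * feat[c]
    # Normalise each sum; a class with zero total weight keeps a zero center.
    return [
        [value / w if w > 0 else 0.0 for value in center]
        for center, w in zip(centers, weights)
    ]


def pixel_class_affinity(feature: Vector, centers: List[Vector]) -> Vector:
    """Dot-product similarity between one pixel feature and each class center."""
    return [sum(f * c for f, c in zip(feature, center)) for center in centers]


if __name__ == "__main__":
    feats = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
    probs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    centers = class_centers(feats, probs)
    print(centers)
    print(pixel_class_affinity([1.0, 1.0], centers))
```

Under this assumed formulation, calling `class_centers` on all pixels of an image would give a global representation, while calling it on the pixels of a single patch would give a local one. How D-CANet actually computes and fuses the two is not specified in the abstract.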