Abstract: As spatial resolution increases, the information conveyed by remote sensing images becomes increasingly complex. Large variations in object scale and the highly discrete distribution of objects make semantic segmentation of remote sensing images considerably more challenging. Mainstream approaches usually rely on implicit attention mechanisms or transformer modules to capture global context. However, these approaches fail to explicitly extract intraobject consistency and interobject saliency features, leading to unclear boundaries and incomplete structures. In this article, we propose a category-guided global–local feature interaction network (CGGLNet), which utilizes category information to guide the modeling of global contextual information. To better acquire global information, we propose a category-guided supervised transformer module (CGSTM). This module guides the modeling of global contextual information by estimating the potential class information of pixels, so that features of the same class are more tightly aggregated and those of different classes are more easily distinguished. To enhance the representation of local detailed features of multiscale objects, we design an adaptive local feature extraction module (ALFEM). By connecting the CGSTM and the ALFEM in parallel, our network can extract the rich global and local context information contained in the image. Meanwhile, the proposed feature refinement segmentation head (FRSH) helps to reduce the semantic gap between deep and shallow features and fully integrates information from different levels. Extensive ablation and comparison experiments on two public remote sensing datasets (the ISPRS Vaihingen dataset and the ISPRS Potsdam dataset) indicate that the proposed CGGLNet achieves superior performance compared with state-of-the-art methods.
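The idea of letting estimated class information guide global context can be illustrated with a minimal NumPy sketch. This is not the CGSTM from the article, whose internals the abstract does not specify. It is a generic class-center attention scheme, assumed here only for illustration: coarse per-pixel class scores are turned into soft memberships, each class gets a membership-weighted mean feature (a "class center"), and every pixel then attends to those centers. The function names `softmax` and `category_guided_context`, the array shapes, and the residual addition are all hypothetical choices.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def category_guided_context(features, class_logits):
    """Illustrative class-center attention (an assumption, not the paper's module).

    features:     (N, C) array, one C-dimensional feature per pixel.
    class_logits: (N, K) array, coarse per-pixel scores for K classes.
    Returns an (N, C) array: input features plus class-level context.
    """
    n_pixels, n_channels = features.shape

    # Soft class membership of every pixel.
    probs = softmax(class_logits, axis=1)                          # (N, K)

    # Class centers: membership-weighted mean of pixel features per class.
    weights = probs / (probs.sum(axis=0, keepdims=True) + 1e-8)    # (N, K)
    centers = weights.T @ features                                 # (K, C)

    # Each pixel attends to the K class centers (scaled dot product).
    attn = softmax(features @ centers.T / np.sqrt(n_channels), axis=1)  # (N, K)
    context = attn @ centers                                       # (N, C)

    # Residual combination of pixel features and class-level context.
    return features + context


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(16, 8))    # 16 pixels, 8 channels (toy sizes)
    logits = rng.normal(size=(16, 3))   # 3 hypothetical classes
    out = category_guided_context(feats, logits)
    print(out.shape)
```

Because the attention runs over K class centers rather than over all N pixels, its cost grows with N·K instead of N², which is one reason class-level context is attractive for large remote sensing tiles. In a trained network the class scores would come from a supervised auxiliary head rather than random values.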

Global Aggregation then Local Distribution for Scene Parsing

Global Aggregation then Local Distribution in Fully Convolutional Networks

DAR-Net: Dynamic Aggregation Network for Semantic Scene Segmentation

An Adaptive Post-Processing Network with the Global-Local Aggregation for Semantic Segmentation

Spatially-Aware Context Neural Networks

Optimizing Spatial Relationships in GCN to Improve the Classification Accuracy of Remote Sensing Images

Global Context Dependencies Aware Network for Efficient Semantic Segmentation of Fine-Resolution Remoted Sensing Images

A Large-Scale Point Cloud Semantic Segmentation Network Via Local Dual Features and Global Correlations

CGGLNet: Semantic Segmentation Network for Remote Sensing Images Based on Category-Guided Global–Local Feature Interaction

Non-Local Aggregation for RGB-D Semantic Segmentation

Self-constructing graph neural networks to model long-range pixel dependencies for semantic segmentation of remote sensing images

LDCNet: Long-Distance Context Modeling for Large-Scale 3D Point Cloud Scene Semantic Segmentation

SALA: Soft Assignment Local Aggregation for Parameter Efficient 3D Semantic Segmentation

Attention Guided Global Enhancement and Local Refinement Network for Semantic Segmentation

Real-time Semantic Segmentation with Context Aggregation Network

LACTNet: A Lightweight Real-Time Semantic Segmentation Network Based on an Aggregated Convolutional Neural Network and Transformer

High-Resolution Remote Sensing Image Segmentation With Global-Guided Normalization and Local Affinity Distillation

GA-NET: Global Attention Network for Point Cloud Semantic Segmentation

NeiEA-NET: Semantic segmentation of large-scale point cloud scene via neighbor enhancement and aggregation

GLSNet++: Global and Local-Stream Feature Fusion for LiDAR Point Cloud Semantic Segmentation Using GNN Demixing Block

Attention-guided chained context aggregation for semantic segmentation