Abstract: Land cover and land use classification from remote sensing imagery has advanced significantly in recent years. However, it remains challenging to strengthen the semantic representation of high-resolution networks while handling imbalanced land categories and fusing multi-scale data without degrading the accuracy of semantic segmentation. To address this challenge, this paper presents a deep-neural-network method for classifying high-resolution remote sensing images that segments urban construction land into five categories: vegetation, water, buildings, roads, and bare soil. The network combines a U-shaped high-resolution neural network with the high-resolution network (HRNet) framework. Feature maps of different resolutions are maintained in parallel, which allows information to be exchanged between them. A data pre-processing module addresses data imbalance in the semantic segmentation of urban construction land, increasing the Intersection over Union (IoU) values of the different land types by 3.75% to 12.01%. In addition, a target context representation module is introduced to enhance the feature representation of each pixel by computing the relationship between pixels and multiple target regions, and a polarization attention mechanism is proposed to extract the characteristics of geographical objects in all directions and achieve a stronger semantic representation. The method offers an accurate and effective way to extract construction land information and supports the development of monitoring algorithms for urban construction land. To validate the proposed U-HRNet-OCR+PSA network, it was compared with six classical networks (DeepLabv3+, PSPNet, U-Net, U-Net++, HRNet, and HRNet-OCR) and with the more recent ViT-Adapter-L, OneFormer, and InternImage-H. The experiments show that U-HRNet-OCR+PSA achieves higher accuracy than these networks. On the multi-scale dataset, its IoU values for buildings, roads, vegetation, bare soil, and water are 89.79%, 90.05%, 94.89%, 85.91%, and 88.36%, respectively.
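
The per-class IoU values reported above follow the usual definition: for each class, the number of pixels where prediction and ground truth agree on that class, divided by the number of pixels assigned to that class in either map. The following is a minimal Python sketch of that definition only. The abstract does not describe the paper's evaluation protocol, so the function name `per_class_iou`, the integer class ids, and the toy label maps are all illustrative assumptions.

```python
from typing import List, Sequence

# Class names are taken from the abstract; the integer ids (0..4) are an assumption.
CLASSES = ["vegetation", "water", "buildings", "roads", "bare soil"]


def per_class_iou(pred: Sequence[int], target: Sequence[int],
                  num_classes: int) -> List[float]:
    """Return the IoU of each class for two equally long sequences of class ids."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1
    ious = []
    for c in range(num_classes):
        # |A union B| = |A| + |B| - |A intersect B|
        union = pred_count[c] + target_count[c] - intersection[c]
        # A class absent from both maps has no defined IoU; report NaN.
        ious.append(intersection[c] / union if union else float("nan"))
    return ious


# Hypothetical 2x4 label maps, flattened row by row.
target = [0, 0, 1, 1, 2, 2, 3, 4]
pred = [0, 1, 1, 1, 2, 3, 3, 4]
ious = per_class_iou(pred, target, len(CLASSES))
for name, value in zip(CLASSES, ious):
    print(f"{name}: {value:.2%}")
```

Computing IoU separately for each class, rather than only overall pixel accuracy, keeps rare classes visible: a class that covers few pixels can score poorly without noticeably changing the overall accuracy, which is why imbalance effects are typically reported this way.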

High Resolution Scene Parsing Network Based on Semantic Segmentation

Dynamic High-Resolution Network for Semantic Segmentation in Remote-Sensing Images

HRDLNet: a Semantic Segmentation Network with High Resolution Representation for Urban Street View Images

UHRSNet: A Semantic Segmentation Network Specifically for Ultra-High-Resolution Images

HRNet- and PSPNet-based multiband semantic segmentation of remote sensing images

Multi-scale Image Semantic Segmentation Based on ASPP and Improved HRNet

Multi-Branch Adaptive Hard Region Mining Network for Urban Scene Parsing of High-Resolution Remote-Sensing Images

Parsing Very High Resolution Urban Scene Images by Learning Deep ConvNets with Edge-Aware Loss

Semantic segmentation of urban land classes using a multi-scale dataset

Hi-ResNet: Edge Detail Enhancement for High-Resolution Remote Sensing Segmentation

HRCNet: High-Resolution Context Extraction Network for Semantic Segmentation of Remote Sensing Images

High Resolution Feature Recovering for Accelerating Urban Scene Parsing

Multi-Scale Context Aggregation for Semantic Segmentation of Remote Sensing Images

HrreNet: Semantic Segmentation Network for Moderate and High-Resolution Satellite Images

Integrating Detailed Features and Global Contexts for Semantic Segmentation in Ultra-High-Resolution Remote Sensing Images

Transformer and CNN Hybrid Deep Neural Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Imagery

Semantic Segmentation of Urban Buildings from VHR Remote Sensing Imagery Using a Deep Convolutional Neural Network

Semantic Relocation Parallel Network for Semantic Segmentation

HoloSeg: an Efficient Holographic Segmentation Network for Real-time Scene Parsing

Semantic Segmentation Based on Spatial Pyramid Pooling and Multilayer Feature Fusion

HCRB-MSAN: Horizontally Connected Residual Blocks-Based Multiscale Attention Network for Semantic Segmentation of Buildings in HSR Remote Sensing Images