Abstract: Land cover and land use classification from remote sensing imagery has advanced significantly in recent years. However, it remains challenging to strengthen the semantic representation of high-resolution networks while handling imbalanced land categories and fusing multi-scale data without compromising segmentation accuracy. To address this, this paper presents a method for classifying high-resolution remote sensing images with a deep neural network that segments urban construction land into five categories: vegetation, water, buildings, roads, and bare soil. The network combines a U-shaped high-resolution neural network with the high-resolution network (HRNet) framework. Feature maps of different resolutions are maintained in parallel, which enables information exchange between them. A data pre-processing module addresses class imbalance in the semantic segmentation of urban construction land, increasing the Intersection over Union (IoU) values of the different land types by 3.75% to 12.01%. In addition, a target context representation module enhances the feature representation of each pixel by computing the relationship between pixels and multiple target regions. A polarization attention mechanism is also proposed to extract the characteristics of geographical objects in all directions and achieve a stronger semantic representation. The method offers an accurate and effective way to extract information on construction land and supports the development of monitoring algorithms for urban construction land. To validate the proposed U-HRNet-OCR+PSA network, it was compared with six classical networks (DeepLabv3+, PSPNet, U-Net, U-Net++, HRNet, and HRNet-OCR) as well as the more recent ViT-Adapter-L, OneFormer, and InternImage-H.
The experiments show that the U-HRNet-OCR+PSA network achieves higher accuracy than the aforementioned networks. Specifically, its IoU values for buildings, roads, vegetation, bare soil, and water on the multi-scale dataset are 89.79%, 90.05%, 94.89%, 85.91%, and 88.36%, respectively.
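
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how per-class IoU is commonly computed from flattened label maps: for each class, correctly labeled pixels divided by the pixels assigned to that class in either the ground truth or the prediction. It is not the authors' evaluation code. The function name `per_class_iou`, the integer label order in `CLASSES`, and the toy labels are assumptions made for illustration only.

```python
from typing import List, Sequence

# Assumed (hypothetical) label order; the paper does not specify an encoding.
CLASSES = ["vegetation", "water", "buildings", "roads", "bare soil"]


def per_class_iou(y_true: Sequence[int], y_pred: Sequence[int],
                  num_classes: int) -> List[float]:
    """Return IoU = TP / (TP + FP + FN) for each class index."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    intersection = [0] * num_classes  # true positives per class
    union = [0] * num_classes         # TP + FP + FN per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            intersection[t] += 1
            union[t] += 1
        else:
            union[t] += 1  # missed pixel: false negative for the true class
            union[p] += 1  # wrong pixel: false positive for the predicted class
    # A class absent from both maps has no defined IoU; report NaN for it.
    return [i / u if u else float("nan") for i, u in zip(intersection, union)]


# Toy flattened label maps (made-up values, not data from the paper).
y_true = [0, 0, 1, 1, 2, 2, 3, 4]
y_pred = [0, 1, 1, 1, 2, 3, 3, 4]
ious = per_class_iou(y_true, y_pred, len(CLASSES))
for name, value in zip(CLASSES, ious):
    print(f"{name}: {value:.2%}")
```

Reporting IoU per class rather than only as a mean makes the effect of class imbalance visible, since a rare class with poor overlap is not masked by frequent classes.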
