Abstract: Deep learning-based object detection methods are used for safety management at construction sites, but they require large-scale, high-quality, well-labeled datasets for training. Existing construction datasets are relatively small because labor-intensive annotation is expensive, and the varying quality of construction images also degrades detection performance. To address these dataset limitations, this study proposes a new method for construction object detection that integrates super-resolution and semi-supervised learning. The proposed method improves the quality of construction images and achieves strong detection performance with limited labeled data. First, the Real-ESRGAN model is introduced to improve the quality of construction images and make construction objects more clearly visible. This super-resolution step enhances the texture details of low-resolution images, thereby improving the performance of object detection models. Second, a mean-teacher network is adopted to expand the training set, thus avoiding labor-intensive annotation work. To verify its effectiveness, the method is applied to the state-of-the-art YOLOv5 object detection model, and construction images from the Site Object Detection Dataset (SODA) with different proportions of labeled data (10% to 50% in 10% increments, plus an extreme case of 5%) are used as the training set. Comparison with the existing supervised learning method shows that the proposed method achieves better detection performance. In particular, the gain in detection performance is larger when the proportion of labeled data is smaller, which is of great practical value in real-world engineering. The experimental results show the potential of the proposed method for improving image quality and reducing the cost of developing construction datasets.
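
The abstract does not give implementation details, so the following is only a minimal sketch of two ingredients it mentions: splitting a dataset into labeled and unlabeled subsets at a given labeled proportion, and the exponential-moving-average (EMA) weight update that is the standard way a mean-teacher network keeps its teacher in sync with the student. The function names `split_labeled` and `ema_update`, the `decay` value, and the plain-dictionary weight format are all assumptions made for illustration; they are not taken from the paper or from any YOLOv5 code.

```python
import random


def split_labeled(image_ids, labeled_fraction, seed=0):
    """Randomly split image ids into (labeled, unlabeled) subsets.

    Hypothetical helper: labeled_fraction is e.g. 0.05 or 0.10 to mimic
    the 5% and 10% labeled-data settings described in the abstract.
    """
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # seeded, so the split is reproducible
    n_labeled = round(len(ids) * labeled_fraction)
    return ids[:n_labeled], ids[n_labeled:]


def ema_update(teacher, student, decay=0.99):
    """Return new teacher weights as an EMA of the student weights.

    Weights are plain dicts mapping a parameter name to a list of floats
    (an assumption for this sketch; a real model would use tensors).
    teacher_new = decay * teacher + (1 - decay) * student
    """
    if teacher.keys() != student.keys():
        raise ValueError("teacher and student must share parameter names")
    return {
        name: [
            decay * t + (1.0 - decay) * s
            for t, s in zip(teacher[name], student[name])
        ]
        for name in teacher
    }


if __name__ == "__main__":
    labeled, unlabeled = split_labeled(range(100), labeled_fraction=0.05)
    print(len(labeled), len(unlabeled))

    teacher = {"w": [0.0, 1.0]}
    student = {"w": [1.0, 1.0]}
    print(ema_update(teacher, student, decay=0.9))
```

In a mean-teacher setup of this kind, the student is trained on the labeled subset plus teacher predictions on the unlabeled subset, and `ema_update` would be called after each student optimization step. A decay close to 1 makes the teacher change slowly, which tends to stabilize its predictions.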

Synthesizing High-Quality Construction Segmentation Datasets Through Pre-trained Diffusion Model

Semi-supervised learning approach for construction object detection by integrating super-resolution and mean teacher network

DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models

MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation

DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models

Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion

Deep semantic segmentation for visual understanding on construction sites

Denoising Diffusion Semantic Segmentation with Mask Prior Modeling

Diffusion Features to Bridge Domain Gap for Semantic Segmentation

DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery

Dataset Diffusion: Diffusion-based Synthetic Dataset Generation for Pixel-Level Semantic Segmentation

Towards Automatic Construction of Diverse, High-Quality Image Datasets

Exploring Limits of Diffusion-Synthetic Training with Weakly Supervised Semantic Segmentation

A Multi-Objective Semantic Segmentation Algorithm Based on Improved U-Net Networks

SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation

P-MSDiff: Parallel Multi-Scale Diffusion for Remote Sensing Image Segmentation

BIM-driven Data Augmentation Method for Semantic Segmentation in Superpoint-Based Deep Learning Network

Image Augmentation with Controlled Diffusion for Weakly-Supervised Semantic Segmentation

FireDM: A weakly-supervised approach for massive generation of multi-scale and multi-scene fire segmentation datasets

Open-vocabulary Object Segmentation with Diffusion Models