A construction waste landfill dataset of two districts in Beijing, China from high resolution satellite images

Shaofu Lin, Lei Huang, Xiliang Liu, Guihong Chen, Zhe Fu
DOI: https://doi.org/10.1038/s41597-024-03240-0
2024-04-17
Scientific Data
Abstract: Construction waste is unavoidable in the process of urban development, causing serious environmental pollution. Accurate assessment of municipal construction waste generation requires building construction waste identification models using deep learning technology. However, this process requires high-quality public datasets for model training and validation. This study utilizes Google Earth and GF-2 images as the data source to construct a specific dataset of construction waste landfills in the Changping and Daxing districts of Beijing, China. This dataset contains 3,653 samples of the original image areas and provides mask-labeled images in the semantic segmentation domains. Each pixel within a construction waste landfill is classified into 4 categories of the image areas, including background area, vacant landfillable area, engineering facility area, and waste dumping area. The dataset contains 237,115,531 pixels of construction waste and 49,724,513 pixels of engineering facilities. The pixel-level semantic segmentation labels are provided to quantify the construction waste yield, which can serve as the basic data for construction waste extraction and yield estimation both for academic and industrial research.
multidisciplinary sciences
What problem does this paper attempt to address?
This paper addresses the identification and assessment of construction waste landfills during urbanization. Construction waste contains harmful substances, so dumping it directly causes serious environmental pollution, and the amount of waste generated must be estimated accurately to measure the cost of urban renewal. Traditional identification methods rely mainly on manual on-site investigation and on machine-learning-based approaches, which are time-consuming, labor-intensive, inefficient, and poorly suited to specific types of waste. To overcome these limitations, the study uses high-resolution satellite images (Google Earth and GF-2) to construct a dedicated dataset for training and validating deep-learning models, enabling automatic identification and quantification of construction waste landfills. The dataset contains 3,653 original image samples from the Changping and Daxing districts of Beijing and provides mask-labeled images for semantic segmentation. Each pixel is classified into one of four categories: background area, vacant landfillable area, engineering facility area, and waste dumping area. Because the labels are pixel-level, the dataset can be used to quantify construction waste yield and serves as basic data for academic and industrial research. This saves human and material resources, improves efficiency, shortens the information-extraction cycle, and enables more accurate monitoring and management of construction waste.
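To illustrate how pixel-level labels can support yield quantification, the following is a minimal sketch that counts pixels per category in a single mask and converts a count to ground area. The integer class IDs, the function names, and the ground sample distance value are assumptions made for illustration; the dataset's actual label encoding and image resolution should be taken from its documentation.

```python
import numpy as np

# Hypothetical class IDs: the dataset's real label encoding may differ.
CLASS_NAMES = {
    0: "background area",
    1: "vacant landfillable area",
    2: "engineering facility area",
    3: "waste dumping area",
}


def count_class_pixels(mask):
    """Return {category name: pixel count} for one integer-labeled mask."""
    counts = np.bincount(mask.ravel(), minlength=len(CLASS_NAMES))
    return {name: int(counts[idx]) for idx, name in CLASS_NAMES.items()}


def pixels_to_area_m2(pixel_count, gsd_m):
    """Convert a pixel count to ground area in square metres.

    gsd_m is the ground sample distance (metres per pixel side),
    supplied by the caller; it is not specified here.
    """
    return pixel_count * gsd_m ** 2


# Toy 4x4 mask standing in for a real label image.
toy_mask = np.array(
    [
        [0, 0, 1, 1],
        [0, 3, 3, 1],
        [2, 3, 3, 0],
        [2, 2, 0, 0],
    ],
    dtype=np.uint8,
)

counts = count_class_pixels(toy_mask)
for name, n in counts.items():
    print(f"{name}: {n} pixels")

# Area of the waste dumping class under an assumed 0.5 m/pixel resolution.
waste_area = pixels_to_area_m2(counts["waste dumping area"], gsd_m=0.5)
print(f"waste dumping area: {waste_area} m^2")
```

Summing such per-mask counts over all samples would give dataset-level totals comparable in kind to the pixel totals reported in the abstract, provided the class IDs match the real encoding.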