Reidentification-Based Automated Matching for 3D Localization of Workers in Construction Sites
Qilin Zhang (1); Zhichen Wang (2); Bin Yang (3); Ke Lei (4); Binghan Zhang (5); and Boda Liu (6)
1 Professor, College of Civil Engineering, Tongji Univ., Shanghai 200092, PR China.
2 Master's Candidate, College of Civil Engineering, Tongji Univ., Shanghai 200092, PR China. Email: wangzhichen@tongji.edu.cn
3 Associate Professor, College of Civil Engineering, Tongji Univ., Shanghai 200092, PR China (corresponding author). ORCID: https://orcid.org/0000-0001-7175-8001. Email: yangbin@tongji.edu.cn
4 Senior Engineer, China Construction Eighth Engineering Division Corp. Ltd., No. 1568, Century Ave., Pudong District, Shanghai 200135, China. Email: leike0307@163.com
5 Ph.D. Candidate, College of Civil Engineering, Tongji Univ., Shanghai 200092, PR China. Email: zhangbinghan@tongji.edu.cn
6 Ph.D. Candidate, College of Civil Engineering, Tongji Univ., Shanghai 200092, PR China. ORCID: https://orcid.org/0000-0001-6860-6082. Email: 1832553@tongji.edu.cn
DOI: https://doi.org/10.1061/(asce)cp.1943-5487.0000975
IF: 5.802
2021-08-11
Journal of Computing in Civil Engineering
Abstract: The location information of entities in construction sites, such as workers and construction machines, is valuable for project management and safety. As nonintrusive and accurate solutions, various vision-based methods have therefore been proposed to track entities in construction sites and obtain their three-dimensional (3D) coordinates. However, most existing vision-based methods achieve 3D localization by matching entities along the epipolar line. This makes entity matching unstable when the epipolar line is calculated with error, and it fails when multiple entities lie on the same epipolar line. To solve this problem, a novel framework based on reidentification is proposed to automatically match workers across two camera views and thereby obtain their 3D coordinates in construction sites. In this framework, deep-learning-based computer vision algorithms are first used to detect and track workers in the two camera views. Then, a reidentification (ReID) algorithm uses the visual features of the tracked workers to match them across the two camera views and across different frames. For every matched pair, the worker's pixel locations in the two camera views are then used to calculate the 3D coordinates through triangulation. Experiments on videos recorded at a real construction project demonstrate the feasibility and accuracy of the framework. Specifically, by using the ReID algorithm to match workers, the framework achieves competitive worker-matching results, with precision, recall, and accuracy of more than 99%, 93%, and 93%, respectively. It also effectively addresses the practical problems of ID repetition and ID switching. Finally, this paper extends the application scenarios of reidentification algorithms in construction sites, thereby contributing to the future application of multiple-camera vision-based methods in construction sites.
Computer Science, Interdisciplinary Applications; Engineering, Civil
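The last two steps described in the abstract, matching workers across two views by their visual features and then triangulating each matched pair, can be illustrated with a minimal sketch. It is not the authors' implementation. The feature vectors stand in for the output of some ReID model, the greedy cosine-similarity matching and its `threshold` are illustrative assumptions, and triangulation uses the simple midpoint-of-two-rays method for two pinhole cameras whose axes are assumed to be aligned with the world axes. All function names and parameter values (`focal`, `cx`, `cy`, camera centres) are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_tracks(feats_a, feats_b, threshold=0.5):
    """Greedy one-to-one matching of track IDs between two views.

    feats_a / feats_b map a track ID to a feature vector (assumed to come
    from a ReID model). Pairs are accepted in order of decreasing
    similarity; pairs below `threshold` are left unmatched.
    """
    pairs = sorted(
        ((cosine_similarity(fa, fb), ia, ib)
         for ia, fa in feats_a.items()
         for ib, fb in feats_b.items()),
        reverse=True,
    )
    used_a, used_b, matches = set(), set(), {}
    for sim, ia, ib in pairs:
        if sim < threshold:
            break
        if ia in used_a or ib in used_b:
            continue
        matches[ia] = ib
        used_a.add(ia)
        used_b.add(ib)
    return matches


def pixel_to_ray(u, v, focal, cx, cy):
    """Viewing-ray direction of pixel (u, v) for an axis-aligned pinhole camera."""
    return ((u - cx) / focal, (v - cy) / focal, 1.0)


def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    dot = lambda p, q: sum(x * y for x, y in zip(p, q))
    w = tuple(x - y for x, y in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; cannot triangulate")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))


# Usage: two workers seen in both views, then one matched pair triangulated.
view_a = {"a1": [1.0, 0.0, 0.0], "a2": [0.0, 1.0, 0.0]}
view_b = {"b1": [0.1, 0.9, 0.0], "b2": [0.9, 0.1, 0.0]}
matches = match_tracks(view_a, view_b)

focal, cx, cy = 1000.0, 640.0, 360.0
ray_1 = pixel_to_ray(740.0, 410.0, focal, cx, cy)   # worker in camera 1
ray_2 = pixel_to_ray(540.0, 410.0, focal, cx, cy)   # same worker in camera 2
point_3d = triangulate_midpoint((0.0, 0.0, 0.0), ray_1, (2.0, 0.0, 0.0), ray_2)
print(matches, point_3d)
```

In practice the cameras would have calibrated intrinsic and extrinsic parameters rather than the axis-aligned setup assumed here, and an optimal assignment method could replace the greedy loop when many workers are in view.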