Using Temporal Convolutional Networks to Enable Action Recognition for Construction Equipment.

Mingen Wang, Jinyue Zhang, Yuexian Hou
DOI: https://doi.org/10.1145/3447450.3447467
2020-01-01
Abstract: To improve construction efficiency and reduce or avoid accidents, the activities of construction equipment must be identified accurately. However, research on construction equipment action recognition (CEAR) is still very limited, mainly because of the lack of video datasets for algorithm development. This research developed a video dataset of 2064 clips covering five action categories of excavators and dump trucks. Using this dataset, a simplified temporal convolutional network (STCN) was developed by combining convolutional neural networks (CNN) with long short-term memory (LSTM) networks: the CNN extracts image features from the frames of each video clip, and the LSTM extracts time-series features from the resulting frame sequences. Compared with a similar CEAR method (CNN-DLSTM) and two advanced human action recognition methods (C3D and two-stream ConvNets), the STCN method achieves comparable performance and requires less training time. This study also provides preliminary evidence that human action recognition (HAR) methods have potential for recognizing the actions of construction equipment.
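The CNN-then-LSTM data flow described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' STCN: the weights are random and untrained, and the 3x3 kernels, the number of kernels, the hidden size, the gate ordering, and all function names (conv_features, lstm_step, classify_clip, random_params) are assumptions made for illustration. Only the count of five action classes comes from the abstract.

```python
import math
import random

def conv_features(frame, kernels):
    """Per-frame features: valid 3x3 convolution, ReLU, global average pooling."""
    h, w = len(frame), len(frame[0])
    feats = []
    for k in kernels:
        total = 0.0
        for i in range(h - 2):
            for j in range(w - 2):
                s = sum(k[a][b] * frame[i + a][j + b]
                        for a in range(3) for b in range(3))
                total += max(0.0, s)  # ReLU
        feats.append(total / ((h - 2) * (w - 2)))
    return feats

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, W, b):
    """One LSTM step. W has 4*H rows of length len(x)+H; assumed gate order i, f, g, o."""
    H = len(h)
    z = x + h  # concatenate the frame features with the previous hidden state
    pre = [sum(wk * zk for wk, zk in zip(row, z)) + bk for row, bk in zip(W, b)]
    i = [sigmoid(v) for v in pre[0:H]]
    f = [sigmoid(v) for v in pre[H:2 * H]]
    g = [math.tanh(v) for v in pre[2 * H:3 * H]]
    o = [sigmoid(v) for v in pre[3 * H:4 * H]]
    c_new = [f[k] * c[k] + i[k] * g[k] for k in range(H)]
    h_new = [o[k] * math.tanh(c_new[k]) for k in range(H)]
    return h_new, c_new

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def classify_clip(frames, params):
    """Run the CNN on every frame, feed the features to the LSTM, classify the last state."""
    H = params["hidden"]
    h, c = [0.0] * H, [0.0] * H
    for frame in frames:
        x = conv_features(frame, params["kernels"])
        h, c = lstm_step(x, h, c, params["W"], params["b"])
    logits = [sum(w * hk for w, hk in zip(row, h)) for row in params["W_out"]]
    return softmax(logits)

def random_params(n_kernels=4, hidden=8, n_classes=5, seed=0):
    """Untrained random weights; the sizes are illustrative assumptions."""
    rng = random.Random(seed)
    u = lambda: rng.uniform(-0.5, 0.5)
    return {
        "hidden": hidden,
        "kernels": [[[u() for _ in range(3)] for _ in range(3)]
                    for _ in range(n_kernels)],
        "W": [[u() for _ in range(n_kernels + hidden)] for _ in range(4 * hidden)],
        "b": [0.0] * (4 * hidden),
        "W_out": [[u() for _ in range(hidden)] for _ in range(n_classes)],
    }
```

Calling classify_clip on a clip given as a list of 2D grayscale frames returns one probability per action class. With untrained weights the probabilities carry no meaning; the sketch only shows how per-frame image features feed a recurrent model that summarizes the frame sequence.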