Multiple-input Streams Attention (MISA) Network for Skeleton-Based Construction Workers' Action Recognition Using Body-Segment Representation Strategies

Yuanyuan Tian, Jiayu Chen, Jung In Kim, Jungsuk Kwac
DOI: https://doi.org/10.1016/j.autcon.2023.105104
IF: 10.3
2023-01-01
Automation in Construction
Abstract: With the rapid growth of deep learning algorithms, graph convolutional networks (GCNs) have become a common choice for skeleton-based human action recognition, boasting impressive performance. However, existing GCN-based models often rely on the physical connections of the human body, which may not suit complex construction tasks involving various body parts and hand movements. To address this concern, this paper models the human body through topological graphs at varying levels, designed based on body-segment strategies. A multiple-input streams attention (MISA) network is introduced, incorporating GCN and temporal convolutional network (TCN) components to enhance the body-structure topology graph of GCNs with more comprehensive input graphs. Additionally, two-modality motion data and three attention blocks are integrated to capture more discriminative features. Experiments on the Construction Motion Library (CML) dataset demonstrate the superiority of the developed method, which reaches approximately 84.94% recognition accuracy.
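The contrast the abstract draws between a physical-connection graph and a body-segment graph can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the five-joint toy skeleton, the `BONE_EDGES` and `SEGMENTS` groupings, and the helper names are hypothetical and are not the paper's joint layout, segment strategies, or network. The layer shown is a generic normalized graph convolution, not the MISA architecture itself.

```python
from math import sqrt

# Hypothetical 5-joint toy skeleton (NOT the CML joint layout).
JOINTS = ["hip", "spine", "head", "left_hand", "right_hand"]

# Physical topology: edges follow bones only.
BONE_EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]

# Assumed body-segment grouping: joints in one segment are fully connected.
SEGMENTS = [[0, 1, 2], [3, 4]]


def adjacency_from_edges(n, edges):
    """Build a symmetric n x n adjacency matrix from undirected edges."""
    a = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    return a


def adjacency_from_segments(n, segments):
    """Connect every pair of joints that share a segment."""
    edges = [(i, j) for seg in segments for i in seg for j in seg if i < j]
    return adjacency_from_edges(n, edges)


def normalize(a):
    """Return D^-1/2 (A + I) D^-1/2, the usual symmetric GCN normalization."""
    n = len(a)
    a_hat = [[a[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def graph_conv(a_norm, x, w):
    """One layer: ReLU(A_norm @ X @ W), with X of shape joints x features."""
    n, f_in, f_out = len(x), len(w), len(w[0])
    # Aggregate each joint's features over its graph neighbours.
    agg = [[sum(a_norm[i][k] * x[k][c] for k in range(n)) for c in range(f_in)]
           for i in range(n)]
    # Apply the feature transform followed by ReLU.
    return [[max(0.0, sum(agg[i][c] * w[c][o] for c in range(f_in)))
             for o in range(f_out)]
            for i in range(n)]
```

In the bone graph the two hands exchange information only through the spine, whereas the assumed segment graph links them directly; running `graph_conv` once per topology yields one feature stream per input graph, which a multi-stream model could then fuse.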