STGL-GCN: Spatial–Temporal Mixing of Global and Local Self-Attention Graph Convolutional Networks for Human Action Recognition

Zhenggui Xie, Gengzhong Zheng, Liming Miao, Wei Huang
DOI: https://doi.org/10.1109/access.2023.3246127
IF: 3.9
2023-01-01
IEEE Access
Abstract: Human action recognition methods based on skeleton data have been widely studied owing to their strong robustness to illumination and complex backgrounds. Existing methods have achieved good recognition results; however, they still face challenges, such as the fixed topological structure of the graph, the omission of nonphysical joint correlations, and the inability to extract local spatial-temporal features. Herein, we propose spatial-temporal mixing of global and local self-attention graph convolutional networks (STGL-GCN) using skeleton data. The global self-attention matrix captures the potential dependencies of nonphysical correlations between joints, and the local self-attention matrix determines the connection strength of the physical edges between joints. Both matrices are updated together with the convolution parameters in each network layer as the model is trained, so that the learned graph structure yields accurate action representations. Experiments on the NTU-RGBD dataset demonstrate that our model accurately recognizes actions.
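The mixing of a global and a local attention matrix described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes scaled dot-product attention over joint features, a binary physical adjacency matrix with self-loops, and a simple sum of the two matrices before the graph convolution. The function name `mixed_attention_graph_conv` and the weight names `Wq`, `Wk`, `Wout` are hypothetical, and the temporal dimension is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; entries set to -inf come out as exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mixed_attention_graph_conv(X, A, Wq, Wk, Wout):
    """One spatial layer on a single frame (illustrative assumption only).

    X    : (V, C) features for V joints
    A    : (V, V) binary physical adjacency, self-loops included
    Wq/Wk: (C, D) query/key projections (hypothetical names)
    Wout : (C, C_out) convolution weights (hypothetical name)
    """
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(Wq.shape[1])  # (V, V)

    # Global attention: every joint may attend to every other joint,
    # so nonphysical correlations (e.g. hand and foot) can be captured.
    global_attn = softmax(scores, axis=-1)

    # Local attention: scores are kept only on physical edges, so the
    # result reweights the strength of existing skeleton connections.
    masked = np.where(A > 0, scores, -np.inf)
    local_attn = softmax(masked, axis=-1)

    # Assumed mixing rule: plain sum of the two matrices.
    mixed = global_attn + local_attn
    return mixed @ X @ Wout, global_attn, local_attn
```

In a trainable version, `Wq`, `Wk`, and `Wout` would be learned jointly by backpropagation, which corresponds to the abstract's statement that the attention matrices are updated together with the convolution parameters.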