An Attentional Spatial Temporal Graph Convolutional Network with Co-Occurrence Feature Learning for Action Recognition

Dong Tian, Zhe-Ming Lu, Xiao Chen, Long-Hua Ma
DOI: https://doi.org/10.1007/s11042-020-08611-4
IF: 2.577
2020-01-01
Multimedia Tools and Applications
Abstract: Action recognition plays a central role in intelligent surveillance systems, game control, human-computer interaction, and similar applications. In this work, we design a multi-task framework that improves the recent Spatial-Temporal Graph Convolutional Network (ST-GCN) for skeleton-based action recognition by introducing an attention mechanism and co-occurrence feature learning. Specifically, one branch uses attention to focus on the more discriminative features, while another branch aggregates co-occurrence features from all joints globally. Additionally, our multi-task framework exploits the inherent correlation between the branches to further improve classification accuracy and convergence speed. Experiments were carried out on the NTU RGB+D and Kinetics human action datasets. The results show that the accuracy of the proposed multi-task framework is clearly higher than that of ST-GCN and other mainstream methods for 3D action recognition.
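The two branches and the shared objective described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the graph convolution layers are omitted, and every name here (`attention_pool`, `cooccurrence_pool`, `multitask_loss`, the weighting parameter `alpha`) is a hypothetical stand-in chosen for illustration. It assumes each joint is already represented by a small feature vector.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(joints, w):
    """Attention branch (assumed form): score each joint with a dot
    product against w, normalise the scores, and take the weighted sum."""
    scores = softmax([sum(f * wi for f, wi in zip(joint, w)) for joint in joints])
    dim = len(joints[0])
    pooled = [sum(s * joint[d] for s, joint in zip(scores, joints)) for d in range(dim)]
    return pooled, scores


def cooccurrence_pool(joints, joint_weights):
    """Co-occurrence branch (assumed form): mix all joints into one
    vector so that every joint contributes to the global feature."""
    dim = len(joints[0])
    return [sum(a * joint[d] for a, joint in zip(joint_weights, joints)) for d in range(dim)]


def cross_entropy(logits, label):
    """Cross-entropy of a single example given raw class logits."""
    return -math.log(softmax(logits)[label])


def multitask_loss(logits_a, logits_b, label, alpha=0.5):
    """Weighted sum of the two branch losses (alpha is an assumption)."""
    return alpha * cross_entropy(logits_a, label) + (1 - alpha) * cross_entropy(logits_b, label)


# Three joints with two-dimensional features (made-up values).
joints = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, scores = attention_pool(joints, w=[0.0, 0.0])
global_feat = cooccurrence_pool(joints, joint_weights=[1 / 3, 1 / 3, 1 / 3])
loss = multitask_loss([0.0, 0.0], [0.0, 0.0], label=0)
```

With zero attention weights the scores are uniform, so both branches reduce to the mean over joints; in a trained model the learned weights would make the two branches emphasise different aspects of the skeleton.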