Lightweight Skeleton-Based Action Recognition Model Based on Global–local Feature Extraction and Fusion

Zhe Deng, Yulin Wang, Xing Wei, Fan Yang, Chong Zhao, Yang Lu
DOI: https://doi.org/10.1007/s13042-024-02347-5
2024-01-01
International Journal of Machine Learning and Cybernetics
Abstract: Skeleton-based action recognition has become a research hotspot in computer vision because it is lightweight and robust to interference. However, existing approaches have drawbacks: feature extraction limited to a single type of feature, limited expressive power, and low recognition accuracy. To solve these problems, we propose a lightweight skeleton-based action recognition model based on global–local feature extraction and fusion (GLF-GCN). GLF-GCN comprises a module that extracts features from non-connected nodes (Global-GCN), a module that extracts features from adjacent nodes (Local-GCN), and a Dynamic Fusion module. Specifically, Global-GCN combines one-dimensional convolution and shift operations to capture spatio-temporal dependencies across global nodes, using shift operations in place of spatio-temporal graph convolution to reduce computational complexity. Meanwhile, Local-GCN captures temporal and spatial local information from first-order neighboring nodes. On this basis, Dynamic Fusion integrates global information based on joint hierarchy with local information based on body parts to discern the varying dependencies among different body parts and joints, improving the model's ability to interpret different skeleton action sequences. Experimental results on single-stream and multi-stream data show that the proposed model achieves higher accuracy and attains state-of-the-art performance.
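The abstract does not spell out how the shift operation in Global-GCN works, so the following is only a minimal sketch of the general idea behind shift-based graph layers, not the authors' implementation. It assumes a circular, per-channel shift along the joint axis followed by a pointwise channel mix (equivalent to a convolution with kernel size 1). The function names `joint_shift` and `shift_then_pointwise`, the tensor layout (frames, joints, channels), and the shift pattern are all illustrative assumptions.

```python
import numpy as np


def joint_shift(x):
    """Circularly shift channel c by c joints along the joint axis.

    x has shape (T, V, C) = (frames, joints, channels). After the shift,
    each joint holds channels gathered from several other joints, so a
    later pointwise mix sees information from non-adjacent joints.
    This shift pattern is an assumption, not taken from the paper.
    """
    _, V, C = x.shape
    joints = np.arange(V)[:, None]      # (V, 1)
    channels = np.arange(C)[None, :]    # (1, C)
    src = (joints + channels) % V       # (V, C): source joint per (joint, channel)
    return x[:, src, channels]          # (T, V, C)


def shift_then_pointwise(x, weight):
    """Shift across joints, then mix channels with a pointwise projection.

    weight has shape (C, C_out). The shift itself has no parameters and
    no multiplications; only the projection costs arithmetic.
    """
    return joint_shift(x) @ weight      # (T, V, C) @ (C, C_out) -> (T, V, C_out)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, 25, 8))   # 4 frames, 25 joints, 8 channels
    w = rng.standard_normal((8, 16))      # project 8 channels to 16
    y = shift_then_pointwise(x, w)
    print(y.shape)                        # (4, 25, 16)
```

In this sketch the shift only re-indexes the input, which illustrates why replacing a graph convolution's adjacency multiplication with a shift can reduce computation: the remaining cost is the single channel projection.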