RE-STNet: relational enhancement spatio-temporal networks based on skeleton action recognition

Hongwei Chen, Shiqi He, Zexi Chen
DOI: https://doi.org/10.1007/s11042-024-18864-y
IF: 2.577
2024-03-16
Multimedia Tools and Applications
Abstract: Learning comprehensive spatio-temporal joint connections in complex actions is crucial for recognizing actions from skeleton sequences. However, existing methods extract spatio-temporal features uniformly, without focusing on critical joint connections, and fail to provide effective complementary information for the acquired joint features. Additionally, using a single-level topology restricts the exploration of global node relationships, which can lose implicit node correlations and, in turn, affect model fusion. To address these challenges, this study introduces the Relational Enhancement Spatio-Temporal Network (RE-STNet). RE-STNet employs a complementary relationship graph convolution method to capture crucial joint connections and the corresponding positional information within the region. A joint cross-connection module captures the global receptive field of the current pose. Furthermore, because action sequences contain a large amount of invalid information, this paper proposes a temporal incentive module that captures salient temporal frame information and combines it with a multi-scale temporal convolution module to enrich the temporal features. RE-STNet is evaluated on three skeleton datasets, achieving an accuracy of 92.2% on the NTU RGB+D 60 cross-subject split, 88.6% on the NTU RGB+D 120 cross-subject split, and 95.5% on NW-UCLA. The experimental results demonstrate that the model learns more comprehensive spatio-temporal joint information.
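As a rough illustration of the temporal side described in the abstract, the sketch below weights frames by a saliency score and then applies temporal filters at several dilations. It is not the authors' implementation: the motion-energy saliency score, the mean filter standing in for a learned convolution, and the names `frame_weights`, `dilated_mean`, and `multi_scale_temporal` are all assumptions made for this example.

```python
import math


def frame_weights(seq):
    """Softmax over per-frame motion energy (an assumed saliency score)."""
    # seq: list of frames, each a flat list of joint coordinates.
    energy = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(seq, seq[1:]):
        energy.append(sum((c - p) ** 2 for p, c in zip(prev, cur)))
    peak = max(energy)
    exps = [math.exp(e - peak) for e in energy]  # shift for numerical stability
    total = sum(exps)
    return [x / total for x in exps]


def dilated_mean(seq, kernel=3, dilation=1):
    """Zero-padded temporal mean filter; a stand-in for one learned branch."""
    num_frames, channels = len(seq), len(seq[0])
    half = kernel // 2
    out = []
    for t in range(num_frames):
        acc = [0.0] * channels
        for k in range(-half, half + 1):
            idx = t + k * dilation
            if 0 <= idx < num_frames:
                for c in range(channels):
                    acc[c] += seq[idx][c]
        out.append([a / kernel for a in acc])
    return out


def multi_scale_temporal(seq, dilations=(1, 2)):
    """Gate frames by saliency, then concatenate branches of different dilation."""
    weights = frame_weights(seq)
    num_frames = len(seq)
    # Scale by num_frames so that uniform weights leave the frames unchanged.
    gated = [[x * w * num_frames for x in frame]
             for frame, w in zip(seq, weights)]
    branches = [dilated_mean(gated, 3, d) for d in dilations]
    # Concatenate the branch outputs along the channel axis.
    return [sum((b[t] for b in branches), []) for t in range(num_frames)]
```

In a trained network the saliency score and the filters would be learned parameters; fixed functions are used here only to keep the example self-contained.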
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
What problem does this paper attempt to address?