Abstract: Skeleton-based action recognition methods that use complete human skeletons have achieved remarkable performance, but their performance can deteriorate significantly when critical joints or frames of the skeleton sequence are occluded or disrupted. In realistic environments, however, acquiring incomplete and noisy human skeletons is inevitable. To strengthen the robustness of action recognition models, we propose an Improved Spatial Temporal Graph Convolutional Network (IST-GCN) comprising three modules: a Multi-dimension Adaptive Graph Convolutional Network (Md-AGCN), an Enhanced Attention Mechanism (EAM), and a Multi-Scale Temporal Convolutional Network (MS-TCN). First, the Md-AGCN module adaptively adjusts the graph structure according to the layer and to the spatial, temporal, and channel dimensions of each action sample, establishing connections between long-range joints that depend on one another. Then, the EAM module focuses on important information in the spatial, temporal, and channel domains to further strengthen the dependencies between important joints. Finally, the MS-TCN module enlarges the receptive field to extract more latent temporal dependencies. Comprehensive experiments on the NTU-RGB+D and NTU-RGB+D 120 datasets demonstrate that, compared with state-of-the-art (SOTA) approaches, our approach achieves outstanding accuracy and robustness when skeleton samples are incomplete and noisy. Moreover, our model has far fewer parameters and far lower computational complexity than existing approaches.
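To make the receptive-field claim for the MS-TCN module concrete, the following is a minimal sketch of one common way to build a multi-scale temporal convolution: parallel branches that share a kernel size but use different dilation rates. The abstract does not state how MS-TCN obtains its scales, so the use of dilation, the kernel size of 3, the dilation rates (1, 2, 3), and all function names here are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: dilation-based multi-scale temporal convolution.
# Kernel size, dilation rates, and all names are assumptions, not taken
# from the IST-GCN paper.

def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D dilated convolution (cross-correlation) along the time axis."""
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    return [
        sum(w * signal[t + i * dilation] for i, w in enumerate(kernel))
        for t in range(len(signal) - span)
    ]


def receptive_field(kernel_size, dilation):
    """Number of input frames covered by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def multi_scale_temporal(signal, kernel, dilations=(1, 2, 3)):
    """Run one branch per dilation rate; returns {dilation: branch output}."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


# A toy single-joint, single-channel trajectory over 10 frames.
frames = [float(t) for t in range(10)]
branches = multi_scale_temporal(frames, kernel=[1.0, 1.0, 1.0])

for d, out in branches.items():
    print(f"dilation={d}: receptive field={receptive_field(3, d)} frames, "
          f"{len(out)} outputs")
```

With a fixed kernel size, each larger dilation widens the temporal window a branch sees without adding weights, which is one plausible reading of how a multi-scale temporal module can enlarge the receptive field while keeping the parameter count low.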

An improved spatial temporal graph convolutional network for robust skeleton-based action recognition
