Abstract: Skeleton-based action recognition methods that use complete human skeletons have achieved remarkable performance, but their performance can deteriorate significantly when critical joints or frames of the skeleton sequence are occluded or disrupted. In realistic environments, however, acquiring incomplete and noisy human skeletons is inevitable. To strengthen the robustness of action recognition models, we propose an Improved Spatial Temporal Graph Convolutional Network (IST-GCN) model comprising three modules: a Multi-dimension Adaptive Graph Convolutional Network (Md-AGCN), an Enhanced Attention Mechanism (EAM), and a Multi-Scale Temporal Convolutional Network (MS-TCN). Specifically, the Md-AGCN module first adaptively adjusts the graph structure according to the layer and to the spatial, temporal, and channel dimensions of each action sample, establishing connections between long-range joints that have dependencies. Then, the EAM module focuses on important information in the spatial, temporal, and channel domains to further strengthen the dependencies between important joints. Finally, the MS-TCN module enlarges the receptive field to extract more latent temporal dependencies. Comprehensive experiments on the NTU-RGB+D and NTU-RGB+D 120 datasets demonstrate that our approach achieves outstanding accuracy and robustness on incomplete and noisy skeleton samples compared with state-of-the-art (SOTA) approaches. Moreover, the parameter count and computational complexity of our model are far lower than those of existing approaches.
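
The two core ideas in the abstract, a graph whose connectivity adapts to each sample and a temporal convolution with several dilation rates, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the weight shapes, the time-pooled similarity used to build the sample-dependent graph, the chain-shaped skeleton, and the shared temporal kernel are all assumptions made for illustration. The only facts it relies on are that NTU-RGB+D skeletons have 25 joints and that a dilated kernel of size k with dilation d covers 1 + (k - 1) * d frames.

```python
import numpy as np

def normalize_adjacency(A):
    """Add self-loops and row-normalize an adjacency matrix."""
    A = A + np.eye(A.shape[0])
    return A / A.sum(axis=1, keepdims=True)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_graph_conv(x, A_fixed, B_learned, W_theta, W_phi, W_out):
    """Hypothetical adaptive graph convolution on x of shape (C, T, V).

    The effective graph is the sum of a fixed skeleton graph, a learned
    offset, and a sample-dependent joint-to-joint similarity matrix, so
    joints that are far apart on the skeleton can still be connected.
    """
    x_mean = x.mean(axis=1)                      # (C, V), pooled over time
    theta = W_theta @ x_mean                     # (E, V) joint embeddings
    phi = W_phi @ x_mean                         # (E, V)
    S = softmax(theta.T @ phi, axis=-1)          # (V, V) sample-dependent graph
    A = A_fixed + B_learned + S
    y = np.einsum("ctu,vu->ctv", x, A)           # aggregate over neighbours u
    return np.einsum("oc,ctv->otv", W_out, y)    # mix channels: (C_out, T, V)

def multi_scale_temporal_conv(x, kernel, dilations=(1, 2, 4)):
    """Apply one shared odd-length temporal kernel at several dilations.

    Each branch uses zero "same" padding; branches are concatenated on
    the channel axis, giving shape (C * len(dilations), T, V).
    """
    C, T, V = x.shape
    k = len(kernel)
    branches = []
    for d in dilations:
        pad = d * (k - 1) // 2
        xp = np.pad(x, ((0, 0), (pad, pad), (0, 0)))
        out = sum(kernel[i] * xp[:, i * d:i * d + T, :] for i in range(k))
        branches.append(out)
    return np.concatenate(branches, axis=0)

def receptive_fields(k, dilations):
    """Frames covered by each dilated branch: 1 + (k - 1) * d."""
    return [1 + (k - 1) * d for d in dilations]

# Toy input: 3 channels, 16 frames, 25 joints.
rng = np.random.default_rng(0)
C, T, V, E, C_out = 3, 16, 25, 4, 8
x = rng.standard_normal((C, T, V))

# Assumed skeleton: a simple chain of joints, for illustration only.
A_chain = np.zeros((V, V))
idx = np.arange(V - 1)
A_chain[idx, idx + 1] = 1.0
A_chain[idx + 1, idx] = 1.0

y = adaptive_graph_conv(
    x,
    normalize_adjacency(A_chain),
    np.zeros((V, V)),                            # learned offset, zero at init
    rng.standard_normal((E, C)),
    rng.standard_normal((E, C)),
    rng.standard_normal((C_out, C)),
)
z = multi_scale_temporal_conv(y, np.ones(3) / 3.0, dilations=(1, 2, 4))
print(y.shape, z.shape, receptive_fields(3, (1, 2, 4)))
```

In this sketch an occluded joint (a zeroed column of `x`) still receives information from distant joints through the similarity term `S`, which is the intuition behind adapting the graph per sample. A trained model would learn the weights and would typically use separate kernels per channel and per branch.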

Temporal-Aware Graph Convolution Network for Skeleton-based Action Recognition

An Attentional Spatial Temporal Graph Convolutional Network with Co-Occurrence Feature Learning for Action Recognition

Temporal segment graph convolutional networks for skeleton-based action recognition

Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition

Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition

Adaptive Attention Memory Graph Convolutional Networks for Skeleton-Based Action Recognition

Temporal Enhanced Multi-Stream Graph Convolutional Neural Networks For Skeleton-Based Action Recognition

Densely Connected and Multiple Temporal Graph Convolution Networks for Skeleton-based Action Recognition

Temporal Attention-Augmented Graph Convolutional Network for Efficient Skeleton-Based Human Action Recognition

Multi-Scale Spatial Temporal Graph Convolutional Network for Skeleton-Based Action Recognition

A Tri-Attention Enhanced Graph Convolutional Network for Skeleton-Based Action Recognition

An improved spatial temporal graph convolutional network for robust skeleton-based action recognition

Spatial‐temporal Slowfast Graph Convolutional Network for Skeleton‐based Action Recognition

Graph transformer network with temporal kernel attention for skeleton-based action recognition

Multi-Scale Adaptive Aggregate Graph Convolutional Network for Skeleton-Based Action Recognition

Attention adjacency matrix based graph convolutional networks for skeleton-based action recognition

TSGCNeXt: Dynamic-Static Multi-Graph Convolution for Efficient Skeleton-Based Action Recognition with Long-term Learning Potential

Multi-Stage Attention-Enhanced Sparse Graph Convolutional Network for Skeleton-Based Action Recognition

TFC-GCN: Lightweight Temporal Feature Cross-Extraction Graph Convolutional Network for Skeleton-Based Action Recognition

Combining channel-wise joint attention and temporal attention in graph convolutional networks for skeleton-based action recognition

Multi-Scale Spatial-Temporal Self-Attention Graph Convolutional Networks for Skeleton-based Action Recognition