Abstract: Existing approaches to skeleton-based action recognition with graph convolutional networks (GCNs) primarily construct the human skeletal graph from its inherent connections. However, a static skeletal topology shared across all action categories fails to capture discriminative relationships between joint pairs, and current graph structures struggle to model dynamic motion information, which limits their ability to represent both temporal and motion-specific dependencies. To address these limitations, we propose the decoupled static-dynamic co-occurrence graph convolution (DSDC-GConv), which learns and adapts the graph topology by refining the inter-frame and intra-frame joint dependencies in a decomposed manner. Additionally, a multi-level context-aware module is proposed to comprehensively model the latent saliencies of multiple domains in skeletal sequences. This module refines the spatial nodes, temporal dynamics, channel-wise characteristics, and motion dependencies within the graph convolution block. Furthermore, a hierarchical densely connected temporal convolution is proposed to enhance the representation of local features through partial dense connections and to enrich the temporal information during the convolution process. Evaluations on five large-scale benchmark datasets (NTU RGB+D 60, NTU RGB+D 120, Kinetics Skeleton 400, Northwestern-UCLA, and PKU-MMD) demonstrate the effectiveness of the proposed method and its superiority over competing approaches, with recognition accuracies of 93.0% and 97.1% on NTU RGB+D 60, 89.9% and 90.6% on NTU RGB+D 120, 38.6% and 63.4% on Kinetics Skeleton 400, 97.4% on Northwestern-UCLA, and 97.6% and 63.6% on PKU-MMD.

Dynamic Semantic-Based Spatial-Temporal Graph Convolution Network for Skeleton-Based Human Action Recognition

Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition

DSDC-GCN: Decoupled Static-Dynamic Co-occurrence Graph Convolutional Networks for Skeleton-Based Action Recognition

Graph-Temporal LSTM Networks for Skeleton-Based Action Recognition

Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition

An improved spatial temporal graph convolutional network for robust skeleton-based action recognition

Spatial-Temporal Adaptive Graph Convolutional Network for Skeleton-Based Action Recognition

Dynamic spatial-temporal topology graph network for skeleton-based action recognition

Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition

Densely Connected and Multiple Temporal Graph Convolution Networks for Skeleton-based Action Recognition

Dynamic Hypergraph Convolutional Networks for Skeleton-Based Action Recognition

Learning Graph Convolutional Network for Skeleton-Based Human Action Recognition by Neural Searching

Temporal Refinement Graph Convolutional Network for Skeleton-based Action Recognition

Multi-Stage Attention-Enhanced Sparse Graph Convolutional Network for Skeleton-Based Action Recognition

Temporal Enhanced Multi-Stream Graph Convolutional Neural Networks For Skeleton-Based Action Recognition

Self-Relational Graph Convolution Network for Skeleton-Based Action Recognition

Skeleton-Based Action Recognition with Spatial-Structural Graph Convolution

Temporal‐enhanced graph convolution network for skeleton‐based action recognition

Human Skeleton Feature Optimizer and Adaptive Structure Enhancement Graph Convolution Network for Action Recognition

Selective Hypergraph Convolutional Networks for Skeleton-based Action Recognition

Pose-Guided Graph Convolutional Networks for Skeleton-Based Action Recognition