Abstract: Human behaviour recognition is an important research direction in computer vision, with broad application prospects in areas such as human–computer interaction, smart healthcare, video surveillance, and sports motion analysis. However, current skeleton‐based behaviour recognition methods using graph convolutional networks still face challenges, such as the difficulty of fully utilizing the dependencies among distant nodes and of distinguishing similar actions. To address these limitations, a multi‐stream hierarchical perception graph convolutional network model that incorporates angle features is proposed. The model introduces four new angle feature representations to capture subtle variations in different body parts, providing discriminative features that differentiate action details. Additionally, it uses a key angle feature enhancement module to strengthen the angle features that are important for specific actions. The model achieves recognition accuracies of 92.8% and 96.8% under the cross‐subject and cross‐view evaluation criteria of the NTU‐RGB+D dataset, respectively, and accuracies of 89.2% and 90.8% under the cross‐subject and cross‐setup evaluation criteria of the NTU‐RGB+D 120 dataset. The experimental results validate that angle information effectively enhances the model's accuracy and improves its ability to distinguish similar actions.

Distinguishing similar actions has long been a challenge in skeleton‐based action recognition. Because the joint coordinates in these actions are similar, it is difficult to accomplish the recognition task using traditional joint features. To address this issue, angle features are used to capture subtle differences in various body parts, together with a critical angle enhancement module that assigns weights to the different angle feature representations for a given action and so highlights the critical one. The approach is evaluated using a three‐stream ensemble method on three large action recognition datasets: NTU‐RGB+D, NTU‐RGB+D 120, and Kinetics‐400. The experimental results demonstrate that incorporating angular information can effectively complement joint and skeletal features, leading to improved recognition of similar actions and enhanced model performance and robustness.
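As a rough illustration of what an angle feature can look like, the sketch below computes the angle at one joint from the 3D coordinates of that joint and two neighbours. It is a minimal example under stated assumptions, not the authors' definition: the abstract does not specify the four angle representations, and the function name `joint_angle`, the choice of joints, and the toy coordinates are all hypothetical.

```python
import numpy as np

def joint_angle(a, b, c, eps=1e-8):
    """Angle in radians at joint b, between segments b->a and b->c.

    a, b, c: arrays of shape (..., 3) with 3D joint coordinates,
    e.g. one row per frame. `eps` guards against zero-length segments.
    """
    u = a - b
    v = c - b
    cos = np.sum(u * v, axis=-1) / (
        np.linalg.norm(u, axis=-1) * np.linalg.norm(v, axis=-1) + eps
    )
    # Clip to the valid arccos domain to absorb rounding error.
    return np.arccos(np.clip(cos, -1.0, 1.0))

# Hypothetical two-frame sequence for shoulder, elbow, and wrist.
shoulder = np.array([[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]])
elbow = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
wrist = np.array([[1.0, 0.0, 0.0], [0.0, -1.0, 0.0]])

# Frame 0 is a right angle at the elbow; frame 1 is a nearly straight arm.
angles = joint_angle(shoulder, elbow, wrist)
print(angles)
```

Here the three joints stay close to the same line or plane across frames, yet the elbow angle changes substantially, which is the kind of cue the abstract argues raw joint coordinates alone make hard to exploit.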

When Skeleton Meets Motion: Adaptive Multimodal Graph Representation Fusion for Action Recognition

Human-centric multimodal fusion network for robust action recognition

Fusion-GCN: Multimodal Action Recognition using Graph Convolutional Networks

B2C-AFM: Bi-Directional Co-Temporal and Cross-Spatial Attention Fusion Model for Human Action Recognition

Evaluating fusion of RGB-D and inertial sensors for multimodal human action recognition

Skeleton Sequence and RGB Frame Based Multi-Modality Feature Fusion Network for Action Recognition

Symmetrical Enhanced Fusion Network for Skeleton-Based Action Recognition

Multi-Modality Co-Learning for Efficient Skeleton-based Action Recognition

Multimodal human action recognition based on spatio-temporal action representation recognition model

Skeleton Focused Human Activity Recognition in RGB Video

Attention-Based Multilevel Co-Occurrence Graph Convolutional LSTM for 3-D Action Recognition

Fusing angular features for skeleton‐based action recognition using multi‐stream graph convolution network

Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition

MGSAN: multimodal graph self-attention network for skeleton-based action recognition

Occlusion-Aware Graph Neural Networks for Skeleton Action Recognition

Skeleton-Indexed Deep Multi-Modal Feature Learning for High Performance Human Action Recognition

Multi-Scale Adaptive Aggregate Graph Convolutional Network for Skeleton-Based Action Recognition

MFGCN: an efficient graph convolutional network based on multi-order feature information for human skeleton action recognition

Fusing Higher-Order Features in Graph Neural Networks for Skeleton-Based Action Recognition

Multiple temporal scale aggregation graph convolutional network for skeleton-based action recognition