Abstract: Human action recognition has made remarkable progress, and its results are already applied in daily life. However, most methods extract features from only a single view within each modality, which may not comprehensively capture the diversity and complexity of actions. Moreover, when redundant information is not removed effectively, the key information is described less distinctly. Both issues can reduce the final recognition accuracy. To address them, this paper proposes a novel method for single-subject routine action recognition that combines multi-view key information representation with multi-modal fusion. First, the energy of non-primary motion areas in the depth video sequence is reduced by motion mean normalization, thereby enhancing the key information of the action. Then, a depth motion history map (DMHM) and a depth spatio-temporal energy map (DSTEM) are extracted from planes and axes, respectively. The proposed DMHM effectively preserves the spatio-temporal information of actions, while the DSTEM preserves the motion contour and energy information. For skeleton sequences, statistical features and the motion contribution degree of each joint are extracted from the views of motion distribution and weights, respectively. Finally, depth and skeleton features are fused to achieve multi-modal fusion-based action recognition. The proposed method highlights the information of the main motion areas and achieves recognition accuracies of 96.70% on MSR-Action3D, 93.26% on UTD-MHAD, and above 97.73% on all tests of CZU-MHAD. The experimental results demonstrate that the proposed method effectively preserves action information and achieves better recognition accuracy than most existing methods.
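
The abstract does not give the formulas behind motion mean normalization or the DMHM, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that motion energy is the absolute difference between consecutive depth frames, that normalization subtracts the sequence-wide mean energy and clips at zero, and that the history map follows the classic motion-history update. The function names, the `decay` parameter, and the nested-list frame layout are all hypothetical.

```python
from typing import List

Frame = List[List[float]]  # one depth frame as rows of depth values (assumed layout)


def motion_energy(frames: List[Frame]) -> List[Frame]:
    """Absolute depth difference between consecutive frames (needs >= 2 frames)."""
    return [
        [[abs(b - a) for a, b in zip(row_prev, row_next)]
         for row_prev, row_next in zip(prev, nxt)]
        for prev, nxt in zip(frames, frames[1:])
    ]


def suppress_below_mean(energy: List[Frame]) -> List[Frame]:
    """Assumed normalization: subtract the mean energy of the whole
    sequence and clip at zero, so weakly moving areas drop out."""
    values = [v for frame in energy for row in frame for v in row]
    mean = sum(values) / len(values)
    return [[[max(v - mean, 0.0) for v in row] for row in frame]
            for frame in energy]


def motion_history(energy: List[Frame], decay: float = 1.0) -> Frame:
    """Classic motion-history update (a stand-in, not the paper's DMHM):
    pixels that move are set to tau, all others decay toward zero."""
    tau = float(len(energy))
    rows, cols = len(energy[0]), len(energy[0][0])
    history = [[0.0] * cols for _ in range(rows)]
    for frame in energy:
        for r in range(rows):
            for c in range(cols):
                if frame[r][c] > 0.0:
                    history[r][c] = tau
                else:
                    history[r][c] = max(history[r][c] - decay, 0.0)
    return history
```

Under these assumptions, recently moving pixels carry the largest values in the resulting map and older motion fades, which is one simple way a single image can retain both where and when motion occurred.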

Mining 3D Key-Pose-Motifs for Action Recognition

Online Robust Action Recognition Based on a Hierarchical Model

Action Recognition from Arbitrary Views Using 3D-Key-pose Set

Improved Key Poses Model for Skeleton-Based Action Recognition

Action Recognition Based on Global Optimal Similarity Measuring

Attribute Mining for Scalable 3D Human Action Recognition

Shifting Perspective to See Difference: A Novel Multi-View Method for Skeleton Based Action Recognition

Recognizing Actions in 3D Using Action-Snippets and Activated Simplices

Mining Mid-level Features for Action Recognition Based on Effective Skeleton Representation

Mining Spatial and Spatio-Temporal ROIs for Action Recognition

An optimization method of human skeleton keyframes selection for action recognition

Kpose: A New Representation For Action Recognition

An effective representation for action recognition with human skeleton joints

Learning Discriminative Activated Simplices for Action Recognition

ActionPose: Pretraining 3D Human Pose Estimation with the Dark Knowledge of Action

An Approach to Pose-Based Action Recognition

Revealing Key Details to See Differences: A Novel Prototypical Perspective for Skeleton-based Action Recognition

3D Action Recognition Using Multi-Temporal Skeleton Visualization

Multi-view key information representation and multi-modal fusion for single-subject routine action recognition

Human 3D Model-based 2D Action Recognition

On the Utility of 3D Hand Poses for Action Recognition