Abstract: Skeleton-based human action recognition is a longstanding challenge due to its complex dynamics. Some fine-grained details of the dynamics play a vital role in classification. Existing work largely focuses on designing incremental neural networks with more complicated adjacency matrices to capture the details of joint relationships. However, these methods still have difficulty distinguishing actions that have broadly similar motion patterns but belong to different categories. Interestingly, we found that the subtle differences in motion patterns can be significantly amplified and become easy for an observer to distinguish when seen from specific view directions, a property that has not been fully explored before. Drastically different from previous work, we boost performance by proposing a conceptually simple yet effective multi-view strategy that recognizes actions from a collection of dynamic view features. Specifically, we design a novel Skeleton-Anchor Proposal (SAP) module that contains a multi-head structure to learn a set of views. For feature learning under different views, we introduce a novel Angle Representation to transform the actions under different views and feed the transformations into the baseline model. Our module works seamlessly with existing action classification models. When incorporated into baseline models, our SAP module exhibits clear performance gains on many challenging benchmarks. Moreover, comprehensive experiments show that our model consistently outperforms the state of the art and remains effective and robust, especially when dealing with corrupted data. Related code will be available at https://github.com/ideal-idea/SAP.
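The abstract does not define the Angle Representation or how the SAP module proposes its views, so the following is only a minimal sketch of the general idea: re-describing the same skeleton sequence relative to several reference points and stacking the results as per-view features. It assumes, purely for illustration, that a view is a fixed 3D anchor point and that each joint is encoded by the azimuth and elevation of its direction from that anchor. The function names `angles_from_anchor` and `multi_view_features`, the array layout `(frames, joints, 3)`, and the hand-picked anchors are all hypothetical; in the paper the views are learned by the multi-head structure rather than fixed.

```python
import numpy as np


def angles_from_anchor(joints, anchor):
    """Encode each joint by the direction from a reference anchor point.

    Hypothetical illustration, not the paper's Angle Representation.
    joints: array of shape (T, J, 3) with frames, joints, xyz coordinates.
    anchor: array of shape (3,), the assumed view anchor.
    Returns an array of shape (T, J, 2): azimuth and elevation in radians.
    """
    rel = np.asarray(joints, dtype=float) - np.asarray(anchor, dtype=float)
    x, y, z = rel[..., 0], rel[..., 1], rel[..., 2]
    azimuth = np.arctan2(y, x)                  # angle in the xy-plane
    elevation = np.arctan2(z, np.hypot(x, y))   # angle above the xy-plane
    return np.stack([azimuth, elevation], axis=-1)


def multi_view_features(joints, anchors):
    """Stack the angle encoding for several anchors.

    anchors: iterable of (3,) points (fixed here by assumption).
    Returns an array of shape (V, T, J, 2), one slice per view.
    """
    return np.stack([angles_from_anchor(joints, a) for a in anchors], axis=0)


# Toy sequence: 4 frames, 5 joints, random coordinates.
rng = np.random.default_rng(0)
sequence = rng.normal(size=(4, 5, 3))

# Two hand-picked anchors standing in for learned views.
views = multi_view_features(sequence, anchors=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
print(views.shape)
```

Each slice `views[v]` could then be passed to a baseline classifier as one view-specific input. How the per-view outputs are fused is not stated in the abstract and is left out of this sketch.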

A Novel View Attention Network for Skeleton Based Human Action Recognition

Shifting Perspective to See Difference: A Novel Multi-View Method for Skeleton Based Action Recognition

Spatio-Temporal Attention Deep Network for Skeleton Based View-Invariant Human Action Recognition

View Adaptive Neural Networks for High Performance Skeleton-based Human Action Recognition

View Adaptive Recurrent Neural Networks for High Performance Human Action Recognition from Skeleton Data

MVHANet: Multi-view Hierarchical Aggregation Network for Skeleton-Based Hand Gesture Recognition

Channel attention and multi-scale graph neural networks for skeleton-based action recognition

An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition

Action Recognition Based on Multi-Level Topological Channel Attention of Human Skeleton

Self-Attention Network for Skeleton-based Human Action Recognition

Enhanced Skeleton Visualization for View Invariant Human Action Recognition

An attention-aware model for human action recognition on tree-based skeleton sequences

An efficient self-attention network for skeleton-based action recognition

View-Robust Neural Networks for Unseen Human Action Recognition in Videos

Skeleton-Indexed Deep Multi-Modal Feature Learning for High Performance Human Action Recognition

Skeleton Focused Human Activity Recognition in RGB Video

Multiple temporal scale aggregation graph convolutional network for skeleton-based action recognition

An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data

View-Adaptive Graph Neural Network for Action Recognition

Multi-Stage Attention-Enhanced Sparse Graph Convolutional Network for Skeleton-Based Action Recognition

Spatial Temporal Graph Attention Network for Skeleton-Based Action Recognition