Learning Informative Pairwise Joints with Energy-Based Temporal Pyramid for 3D Action Recognition

Mengyuan Liu, Chen, Hong Liu
DOI: https://doi.org/10.1109/icme.2017.8019313
2017-01-01
Abstract: This paper presents an effective local spatial-temporal descriptor for action recognition from skeleton sequences. The distinguishing property of our descriptor is that it takes both spatial-temporal discrimination and action speed variations into account, aiming to solve two problems at once: distinguishing similar actions and identifying actions performed at different speeds. The algorithm consists of two stages. First, a frame selection method removes noisy skeletons from a given skeleton sequence. From the selected skeletons, skeleton joints are mapped to a high-dimensional space, where each point encodes the kinematics, time label, and joint label of a skeleton joint. To encode relative relationships among joints, pairs of points from this space are then jointly mapped to a new space, where each point encodes the relative relationship between two skeleton joints. Second, a Fisher Vector (FV) is employed to encode all points from the new space as a compact feature representation. To cope with speed variations in actions, an energy-based temporal pyramid is applied to form a multi-temporal FV representation, which is fed into a kernel-based extreme learning machine classifier for recognition. Extensive experiments on benchmark datasets consistently show that our method outperforms state-of-the-art approaches for skeleton-based action recognition.
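The abstract does not give formulas for the pairwise mapping or the energy-based split, so the following is only a minimal Python sketch under stated assumptions, not the authors' implementation. It assumes a skeleton sequence is a NumPy array of shape (frames, joints, 3), that a pairwise point is the relative position of two joints plus a normalized time label and both joint labels, and that "energy" is the accumulated joint displacement between consecutive frames. The function names `pairwise_points` and `energy_segments` are hypothetical.

```python
import numpy as np


def pairwise_points(skeletons):
    """Build one point per (frame, joint pair).

    Assumed layout of each point: relative xyz position of the two joints,
    normalized time label, first joint label, second joint label.
    """
    T, J, _ = skeletons.shape
    a, b = np.triu_indices(J, k=1)                  # all joint pairs with a < b
    rel = skeletons[:, a, :] - skeletons[:, b, :]   # shape (T, P, 3)
    P = len(a)
    t = np.repeat(np.arange(T) / max(T - 1, 1), P)  # time label in [0, 1]
    return np.column_stack(
        [rel.reshape(T * P, 3), t, np.tile(a, T), np.tile(b, T)]
    )


def energy_segments(skeletons, n_segments):
    """Split frames into half-open ranges [start, end) of roughly equal energy.

    Assumption: energy is the summed joint displacement between consecutive
    frames. A boundary is placed at the first frame whose accumulated energy
    reaches k/n of the total. Segments can be empty for degenerate input.
    """
    T = skeletons.shape[0]
    step = np.linalg.norm(np.diff(skeletons, axis=0), axis=2).sum(axis=1)
    cum = np.concatenate([[0.0], np.cumsum(step)])  # accumulated energy per frame
    total = cum[-1]
    if total == 0:
        # No motion at all: fall back to a uniform split.
        bounds = np.linspace(0, T, n_segments + 1).round().astype(int)
    else:
        targets = total * np.arange(1, n_segments) / n_segments
        cuts = np.searchsorted(cum, targets, side="left")
        bounds = np.concatenate([[0], cuts, [T]])
    return [(int(bounds[i]), int(bounds[i + 1])) for i in range(n_segments)]
```

In a pipeline like the one described, the points falling in each segment at each pyramid level would be encoded separately (for example with a Fisher Vector) and the results concatenated. Because segment boundaries follow accumulated motion rather than frame count, a slow and a fast performance of the same action would be cut at comparable stages of the movement.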