Abstract: Human action recognition from videos is a challenging task in computer vision. In recent years, histogram-based descriptors computed along dense trajectories have shown promising results for human action recognition, but they usually ignore the motion information of the tracked points and do not exploit the relationships between different motion variables. To address these issues, we propose a motion keypoint trajectory (MKT) approach and a trajectory-based covariance (TBC) descriptor, which is computed along the motion keypoint trajectories. The proposed MKT approach tracks motion keypoints at multiple spatial scales and employs an optical flow rectification algorithm to reduce the influence of camera motion, and thus achieves better performance than the well-known improved dense trajectory (IDT) approach. MKT is also faster than IDT, because it does not require human detection and extracts fewer trajectories. Furthermore, the TBC descriptor outperforms classical histogram-based descriptors such as the Histogram of Oriented Gradients (HOG), Histogram of Optical Flow (HOF) and Motion Boundary Histogram (MBH). Experimental results on three challenging datasets (Olympic Sports, HMDB51 and UCF50) demonstrate that our approach achieves better recognition performance than a number of state-of-the-art approaches.
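As a rough illustration of what a covariance descriptor along a trajectory can look like, the sketch below encodes one tracked trajectory by the covariance of its per-step motion variables. The abstract does not list the variables used by TBC, so the choice of position and velocity here, and the function names `trajectory_features` and `covariance_descriptor`, are assumptions made only for this example and not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def trajectory_features(points: Sequence[Point]) -> List[Tuple[float, ...]]:
    """Build one feature vector (x, y, vx, vy) per tracking step.

    Assumption: position and frame-to-frame displacement stand in for the
    'motion variables'; the paper may use a different or richer set.
    """
    feats = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        feats.append((x1, y1, x1 - x0, y1 - y0))
    return feats


def covariance_descriptor(feats: Sequence[Sequence[float]]) -> List[float]:
    """Return the upper triangle of the sample covariance matrix.

    The covariance captures how pairs of motion variables vary together,
    which per-variable histograms cannot express.
    """
    n = len(feats)
    if n < 2:
        raise ValueError("need at least two feature vectors")
    d = len(feats[0])
    means = [sum(f[j] for f in feats) / n for j in range(d)]
    cov = [
        [
            sum((f[i] - means[i]) * (f[j] - means[j]) for f in feats) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
    # The matrix is symmetric, so d * (d + 1) / 2 values are enough.
    return [cov[i][j] for i in range(d) for j in range(i, d)]


if __name__ == "__main__":
    # A keypoint moving right at constant speed over four frames.
    track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
    print(covariance_descriptor(trajectory_features(track)))
```

With four variables the descriptor has ten entries regardless of trajectory length, which gives a fixed-size vector per trajectory. Published covariance descriptors often apply a matrix logarithm before vectorizing; that step is omitted here for brevity.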

Trajectory-Based 3D Convolutional Descriptors for Human Action Recognition

Trajectory-Pooled 3D Convolutional Descriptors for Action Recognition

Action Recognition with Trajectory-Pooled Deep-Convolutional Descriptors

Action Recognition with Multi-Scale Trajectory-Pooled 3D Convolutional Descriptors

Body Joint Guided 3-D Deep Convolutional Descriptors for Action Recognition

Action Recognition with Joints-Pooled 3D Deep Convolutional Descriptors

Human Action Recognition with Trajectory Based Covariance Descriptor in Unconstrained Videos

Learning Deep Trajectory Descriptor for Action Recognition in Videos Using Deep Neural Networks

Sequential Deep Trajectory Descriptor for Action Recognition with Three-stream CNN

Deep Trajectory for Recognition of Human Behaviours

Action Recognition Based on Joint Trajectory Maps Using Convolutional Neural Networks

Human Action Recognition by Fast Dense Trajectories

Video Representation by Dense Trajectories Motion Map Applied to Human Activity Recognition

Body Joint guided 3D Deep Convolutional Descriptors for Action Recognition

Motion keypoint trajectory and covariance descriptor for human action recognition

Learning 3D Compact Binary Descriptor for Human Action Recognition in Video

Human Action Recognition Based on Point Context Tensor Shape Descriptor

Action Recognition with Multiple Relative Descriptors of Trajectories

Action Recognition Based on Joint Trajectory Maps with Convolutional Neural Networks

Residual Frames with Efficient Pseudo-3D CNN for Human Action Recognition

3D-TDC: A 3D temporal dilation convolution framework for video action recognition