Abstract: Tennis has become an increasingly popular sport throughout the world, and tennis motion recognition based on 3D video has attracted growing attention in recent years. Algorithms based on dynamic time warping (DTW) take the temporal order of movements into account and can handle the variability of human movement along the time axis. However, their efficiency decreases as the number of training samples increases. This work presents a tennis action recognition framework based on action standard sequences. The 3D action video samples are converted into action sequences by feature extraction, and learning the action standard sequence is formulated as a sequence averaging optimization problem under the DTW metric. The DTW barycenter averaging (DBA) algorithm is used to solve this problem. For tennis scenes with significant differences among action categories, we study standard sequence learning for multiple actions and accordingly propose a DBA-K-means clustering algorithm for unsupervised learning. In addition, a human tennis action recognition method that integrates feature optimization and image similarity is proposed. Three dimensionality reduction methods, principal component analysis (PCA), PCA + Pearson, and PCA + Spearman, are compared, showing that PCA combined with the Pearson correlation coefficient gives the best dimensionality reduction effect. The global eight-star model feature is then combined with the dimensionally reduced local HOG feature to fully represent human movements. The similarity between each pair of adjacent frames is calculated, the statistical weights of the single-frame SVM classification results within a discriminant period are adaptively allocated, and the body pose recognition results are then classified a second time. Experiments on the standard KTH data set show that the recognition accuracy of this algorithm is 94.5%, which is higher than that of the other methods.
The method has good application value in the field of video-based human motion recognition. We also demonstrate that it can further improve the efficiency and accuracy of action recognition. Effective feature extraction benefits the accuracy of subsequent human action recognition.
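To illustrate why a single standard sequence per action helps efficiency, the following is a minimal sketch of DTW-based matching, not the authors' implementation. It assumes each action is already a sequence of feature vectors, uses Euclidean distance between frames, and classifies a query by its DTW distance to one standard sequence per class instead of to every training sample. The function names `dtw_distance` and `classify` and the labels `forehand` and `serve` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two sequences of feature vectors (tuples of floats)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame-to-frame distance (assumption)
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats a frame
                                 cost[i][j - 1],      # b advances, a repeats a frame
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def classify(query, standards):
    """Return the label whose standard sequence is closest to the query under DTW."""
    return min(standards, key=lambda label: dtw_distance(query, standards[label]))


# Hypothetical one-dimensional "standard sequences", one per action class.
standards = {
    "forehand": [(0.0,), (1.0,), (2.0,)],
    "serve": [(2.0,), (1.0,), (0.0,)],
}
# The same rising motion performed more slowly (frames repeated).
slow_query = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,)]
label = classify(slow_query, standards)
```

With one standard sequence per class, the number of DTW comparisons per query equals the number of classes rather than the number of training samples. In the framework described above, those standard sequences would come from DBA rather than being written by hand as in this sketch.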