Hybrid LSTM and GAN model for action recognition and prediction of lawn tennis sport activities

Xiaolong Sun, Yong Wang, Jawad Khan
DOI: https://doi.org/10.1007/s00500-023-09215-4
IF: 3.732
2023-09-22
Soft Computing
Abstract: Tennis has gained global popularity, prompting a surge of interest in 3D video-based tennis motion recognition. Early action recognition, which classifies an action before it is complete, is a critical task for preempting adverse outcomes. Prior research emphasizes effective feature extraction and modeling for fast, accurate classification despite limited data availability. To establish a robust foundation, this study places an anticipatory action prediction module ahead of the recognition component. The module forecasts subsequent motions from the observed ones, using an LSTM-GAN structure to mitigate motion blurring and generate predictions. The paper presents a framework that leverages deep learning, particularly dilated neural networks, for real-time spatio-temporal tennis analysis on standard hardware using TensorFlow, with the aim of improving player performance insights and action prediction. A dilated RNN and a CNN are integrated into the recognition module for comprehensive spatio-temporal feature modeling. To foster synergy between the prediction and recognition modules, a hard class mining mechanism is devised to improve learning on samples from challenging classes. The LSTM architecture combined with a GAN achieves 92.1 precision, 91.2 recall, 94.5 F1 score, and 95.0 accuracy in action recognition and prediction for tennis, which the authors report as significantly higher than classical models, namely GAN, Conv3DJ, Co-occurrence LSTM, and GAN + L1 + Mining.
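The abstract does not spell out how the hard class mining mechanism works. As a rough illustration of the general idea only, the sketch below reweights a cross-entropy loss so that classes with higher error rates contribute more. The function names, the `floor` parameter, and the error-rate weighting rule are all assumptions made for this example, not the authors' method.

```python
import math


def class_weights_from_errors(labels, predictions, num_classes, floor=0.05):
    """Hypothetical hard-class weighting: weight each class by its error rate.

    Classes that are misclassified more often receive larger weights.
    `floor` (an assumed parameter) keeps perfectly classified classes
    from getting zero weight. Weights are normalized to average 1.
    """
    totals = [0] * num_classes
    errors = [0] * num_classes
    for y, p in zip(labels, predictions):
        totals[y] += 1
        if y != p:
            errors[y] += 1
    raw = []
    for c in range(num_classes):
        rate = errors[c] / totals[c] if totals[c] else 0.0
        raw.append(max(rate, floor))
    mean = sum(raw) / num_classes
    return [w / mean for w in raw]


def weighted_cross_entropy(probs, labels, weights):
    """Mean cross-entropy with each sample scaled by its true class's weight."""
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp the probability to avoid log(0).
        total += -weights[y] * math.log(max(p[y], 1e-12))
    return total / len(labels)


# Toy example: class 0 is always right, class 1 half right, class 2 always wrong.
labels = [0, 0, 1, 1, 2, 2]
predictions = [0, 0, 1, 0, 1, 0]
weights = class_weights_from_errors(labels, predictions, num_classes=3)
```

In this toy batch the hardest class (2) ends up with the largest weight and the easiest class (0) with the smallest, so a training step driven by `weighted_cross_entropy` would focus more on the classes the model currently gets wrong.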
computer science, artificial intelligence, interdisciplinary applications