Sport Action Recognition with Siamese Spatio-Temporal CNNs: Application to Table Tennis

Pierre-Etienne Martin, Jenny Benois-Pineau, Renaud Péteri, Julien Morlier
DOI: https://doi.org/10.1109/cbmi.2018.8516488
2018-09-01
Abstract: Human action recognition in video is one of the key problems in visual data interpretation. Despite intensive research, the recognition of actions with low inter-class variability remains a challenge. This paper presents a new Siamese Spatio-Temporal Convolutional neural network (SSTC) for this purpose. Applied to table tennis, it can detect and recognize 20 table tennis strokes. The model was trained on a dedicated dataset, TTStroke-21, recorded in natural conditions (markerless) at the Faculty of Sports of the University of Bordeaux. The model takes as inputs an RGB image sequence and its computed optical flow. After three spatio-temporal convolutions, the data are fused in a fully connected layer of the proposed Siamese network architecture. The method reaches an accuracy of 91.4%, against 43.1% for the baseline.
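The data flow described in the abstract (two inputs, three spatio-temporal convolutions per input, fusion in a fully connected layer) can be illustrated with a small shape-bookkeeping sketch. This is not the authors' implementation: the clip size (16 frames of 64x64 pixels), the channel widths (8, 16, 32), the 3x3x3 kernels with padding 1, and the 2x2x2 pooling are all hypothetical values chosen for illustration, and the names `Volume`, `conv_pool_block`, and `branch` are invented here. The only grounded elements are the two input modalities, the three convolution stages, and the fusion before the fully connected layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    """Shape of a video tensor: channels x frames x height x width."""
    channels: int
    frames: int
    height: int
    width: int

    def size(self) -> int:
        # Number of values once the volume is flattened into a vector.
        return self.channels * self.frames * self.height * self.width


def conv_pool_block(x: Volume, out_channels: int) -> Volume:
    """Shape effect of one assumed spatio-temporal block.

    Assumption: a 3x3x3 convolution with padding 1 (frames, height and
    width unchanged) followed by 2x2x2 max pooling (each one halved).
    """
    return Volume(out_channels, x.frames // 2, x.height // 2, x.width // 2)


def branch(x: Volume, widths=(8, 16, 32)) -> Volume:
    """Three spatio-temporal blocks, as in the abstract; widths are hypothetical."""
    for out_channels in widths:
        x = conv_pool_block(x, out_channels)
    return x


# Hypothetical clip: 16 frames of 64x64 pixels.
rgb = Volume(channels=3, frames=16, height=64, width=64)   # RGB sequence
flow = Volume(channels=2, frames=16, height=64, width=64)  # optical flow (x and y components)

rgb_out = branch(rgb)
flow_out = branch(flow)

# Fusion: both branch outputs are flattened and concatenated before
# entering the fully connected layer.
fused_features = rgb_out.size() + flow_out.size()
print(rgb_out, flow_out, fused_features)
```

Under these assumed settings each branch ends with a 32 x 2 x 8 x 8 volume, so the fully connected layer would receive 8192 values. Different input sizes or kernel settings change these numbers but not the overall two-branch structure.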