Self-Supervised Learning for Human Pose Estimation in Sports

Katja Ludwig, Sebastian Scherer, Moritz Einfalt, Rainer Lienhart
DOI: https://doi.org/10.1109/icmew53276.2021.9456000
2021-07-05
Abstract: Human pose estimation (HPE) is a commonly used technique to determine derived parameters that are important for improving the performance of athletes in many sports disciplines. This paper proposes two methods to fine-tune an HPE system trained on general poses into a sports-discipline-specific HPE model using only a few labeled images. We show that 50 labeled 2D poses and additional unlabeled videos are sufficient to achieve a Percentage of Correct Keypoints (PCK) of 88.6% at a threshold of 0.1 in the disciplines of triple and long jump, closing 60% of the gap between supervised fine-tuning on the same 50 images and fully supervised training on $60 \times$ more images. The first proposed method uses pseudo labels as a self-supervised training technique, together with a method for filtering the pseudo labels. Furthermore, this paper shows that a mean teacher approach, which is based on consistency between a teacher and a student model, can also improve the results.
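To make the reported metric concrete, the following is a minimal sketch of how PCK at a threshold of 0.1 is commonly computed for a single 2D pose. The abstract does not state which reference length the distances are normalized by, so the `ref_length` parameter (for example a torso or bounding-box size) is an assumption, as are the function name `pck` and the toy keypoints.

```python
import math


def pck(pred, gt, ref_length, threshold=0.1):
    """Fraction of keypoints predicted within threshold * ref_length of the ground truth.

    pred, gt: lists of (x, y) tuples in the same keypoint order.
    ref_length: normalization length (assumed here; the abstract does not specify it).
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of keypoints")
    if not gt:
        return 0.0
    max_dist = threshold * ref_length
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= max_dist
    )
    return correct / len(gt)


# Toy example with four hypothetical keypoints and a reference length of 20 pixels,
# so a prediction counts as correct if it is at most 2 pixels from the ground truth.
gt = [(0.0, 0.0), (10.0, 0.0), (10.0, 20.0), (0.0, 20.0)]
pred = [(1.0, 0.0), (10.0, 3.0), (10.0, 20.0), (0.0, 24.0)]
score = pck(pred, gt, ref_length=20.0)
print(score)  # 0.5: two of the four keypoints fall within 2 pixels
```

A dataset-level score such as the 88.6% quoted above would typically be obtained by aggregating this per-pose count over all annotated keypoints in the test set; the exact aggregation used in the paper is not given in the abstract.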