Exploring Self-Supervised Skeleton-Based Human Action Recognition under Occlusions
Yifei Chen, Kunyu Peng, Alina Roitberg, David Schneider, Jiaming Zhang, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang, Rainer Stiefelhagen
2024-10-23
Abstract: To integrate self-supervised skeleton-based action recognition methods into autonomous robotic systems, it is crucial to consider adverse situations involving target occlusions. Such a scenario, despite its practical relevance, is rarely addressed in existing self-supervised skeleton-based action recognition methods. To empower models with the capacity to address occlusion, we propose a simple and effective method. We first pre-train on occluded skeleton sequences, then apply k-means clustering (KMeans) to the sequence embeddings to group semantically similar samples. Next, we propose KNN-Imputation to fill in missing skeleton data based on the closest sample neighbors. Imputing incomplete skeleton sequences to create relatively complete input sequences provides significant benefits to existing skeleton-based self-supervised methods. Meanwhile, building on the state-of-the-art Partial Spatio-Temporal Learning (PSTL), we introduce an Occluded Partial Spatio-Temporal Learning (OPSTL) framework. This enhancement utilizes Adaptive Spatial Masking (ASM) to make better use of high-quality, intact skeletons. The proposed method is verified on challenging occluded versions of the NTU RGB+D 60 and NTU RGB+D 120 datasets. The source code is publicly available at https://github.com/cyfml/OPSTL.
Computer Vision and Pattern Recognition, Multimedia, Robotics, Image and Video Processing
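
The cluster-then-impute step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes the sequence embeddings from pre-training are already available, represents each skeleton sequence as a flat list of coordinates with `None` marking occluded values, uses a plain Lloyd's k-means with a simplistic evenly spaced initialization, and measures neighbor distance in embedding space. The function names `kmeans` and `knn_impute`, the parameter `n_neighbors`, and the toy data are all hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point.

    Assumption: centroids start from evenly spaced points of the input,
    a simplification chosen to keep the sketch deterministic.
    """
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def knn_impute(sequences, embeddings, labels, n_neighbors=2):
    """Fill None entries with the mean of the nearest same-cluster donors.

    Assumption: neighbors are ranked by distance in embedding space, and
    only neighbors that observed the missing coordinate act as donors.
    """
    imputed = [list(s) for s in sequences]
    for i, seq in enumerate(sequences):
        candidates = sorted(
            (j for j in range(len(sequences)) if j != i and labels[j] == labels[i]),
            key=lambda j: sq_dist(embeddings[i], embeddings[j]),
        )
        for d, value in enumerate(seq):
            if value is None:
                donors = [sequences[j][d] for j in candidates
                          if sequences[j][d] is not None][:n_neighbors]
                if donors:  # leave the gap if no donor observed this value
                    imputed[i][d] = sum(donors) / len(donors)
    return imputed


# Toy data: two well-separated groups of embeddings (hypothetical values).
embeddings = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.2],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.2]]
# Flattened skeleton coordinates; None marks an occluded value.
sequences = [
    [1.0, 2.0, 3.0, 4.0],
    [1.2, None, 3.2, 4.2],
    [0.8, 1.8, None, 3.8],
    [10.0, 20.0, 30.0, 40.0],
    [None, 20.4, 30.4, 40.4],
    [10.2, 20.2, 30.2, None],
]

labels = kmeans(embeddings, k=2)
imputed = knn_impute(sequences, embeddings, labels, n_neighbors=2)
for row in imputed:
    print(row)
```

Restricting donors to the same cluster keeps imputed coordinates semantically consistent with the occluded sample, which is the motivation the abstract gives for clustering before imputation. In practice the number of clusters and neighbors would be tuned on the target dataset.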