Global Co-occurrence Feature Learning and Active Coordinate System Conversion for Skeleton-based Action Recognition

Sheng Li, Tingting Jiang, Tiejun Huang, Yonghong Tian
DOI: https://doi.org/10.1109/wacv45572.2020.9093618
2020-01-01
Abstract: Skeleton-based action recognition has attracted increasing attention in recent years, and the rapid development of deep learning has greatly improved its performance. However, the exploration of action co-occurrence is still not comprehensive. Most existing works mine co-occurrence features from the temporal or spatial domain separately and commonly combine them only at the end. In contrast, our approach learns temporal and spatial co-occurrence features jointly and globally, which we call spatio-temporal-unit feature enhancement (STUFE). To better align the skeleton data, we introduce a novel preprocessing method called active coordinate system conversion (ACSC), in which a coordinate system is learned automatically and used to transform skeleton samples for alignment. The proposed methods are compatible with the two current mainstream types of models, CNN-based and GCN-based. Finally, we validate our methods with both types of models on two benchmarks, NTU-RGB+D and SBU Kinect Interaction. The results show that our methods achieve state-of-the-art performance.