Balanced Representation Learning for Long-tailed Skeleton-based Action Recognition

Hongda Liu, Yunlong Wang, Min Ren, Junxing Hu, Zhengquan Luo, Guangqi Hou, Zhenan Sun
2023-08-27
Abstract: Skeleton-based action recognition has recently made significant progress. However, data imbalance is still a great challenge in real-world scenarios. The performance of current action recognition algorithms declines sharply when training data suffers from heavy class imbalance. The imbalanced data actually degrades the representations learned by these methods and becomes the bottleneck for action recognition. How to learn unbiased representations from imbalanced action data is the key to long-tailed action recognition. In this paper, we propose a novel balanced representation learning method to address the long-tailed problem in action recognition. Firstly, a spatial-temporal action exploration strategy is presented to expand the sample space effectively, generating more valuable samples in a rebalanced manner. Secondly, we design a detached action-aware learning schedule to further mitigate the bias in the representation space. The schedule detaches the representation learning of tail classes from training and proposes an action-aware loss to impose more effective constraints. Additionally, a skip-modal representation is proposed to provide complementary structural information. The proposed method is validated on four skeleton datasets, NTU RGB+D 60, NTU RGB+D 120, NW-UCLA, and Kinetics. It not only achieves consistently large improvement compared to the state-of-the-art (SOTA) methods, but also demonstrates a superior generalization capacity through extensive experiments. Our code is available at https://github.com/firework8/BRL.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses **data imbalance in skeleton-based action recognition under a long-tailed distribution**. Specifically:

1. **Skewed sample space caused by data imbalance**: In real-world data, action categories often follow a long-tailed distribution: common actions (head classes) have many samples, while rare actions (tail classes) have only a few. This imbalance biases the model toward head classes and significantly degrades recognition performance on tail classes.
2. **Bias in representation learning**: Because the data is imbalanced, existing action recognition methods tend to encode the discriminative features of head classes, which distorts the representation space of tail classes and makes the classifier prone to misclassifying them.

To address these problems, the paper proposes a **balanced representation learning method** that mitigates the long-tailed problem in two ways:

- **Generating more valuable samples**: A Spatial-Temporal Action Exploration strategy uses techniques such as Rebalanced Partial Mixup and Temporal Reverse Perception to generate more valuable skeleton action samples that supplement the skewed sample space.
- **Mitigating representation bias**: A Detached Action-Aware Learning Schedule introduces an Action-Aware Loss to reduce representation bias and separates the learning of tail-class-specific patterns from general knowledge, so that different classes are constrained more effectively.

In addition, the paper proposes a Skip-Modal Representation that provides complementary structural information and further improves the model's generalization ability.
With these methods, the paper reports consistently large improvements over state-of-the-art methods on four skeleton datasets (NTU RGB+D 60, NTU RGB+D 120, NW-UCLA, and Kinetics) and demonstrates superior generalization ability.
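
To make the sample-generation ideas more concrete, the sketch below shows generic versions of three building blocks: inverse-frequency sampling weights, temporal reversal of a skeleton sequence, and a mixup restricted to a subset of joints. It is an illustration only and should not be read as the paper's Rebalanced Partial Mixup or Temporal Reverse Perception, whose exact formulations are in the authors' repository. The function names, the data layout (a sequence is a list of frames, each frame a list of `(x, y, z)` joints), and the mixing rule are all assumptions made for this example.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-sample sampling weights proportional to 1 / (class count).

    Hypothetical helper: every class ends up with the same total weight,
    so tail classes are drawn as often as head classes.
    """
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def temporal_reverse(sequence):
    """Return the frames in reverse temporal order; joints are untouched."""
    return sequence[::-1]


def partial_mixup(seq_a, seq_b, joint_ids, lam):
    """Blend only the joints in `joint_ids` of seq_a with seq_b.

    Assumed rule: selected joints become lam * a + (1 - lam) * b,
    all other joints are copied from seq_a unchanged.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same number of frames")
    mixed = []
    for frame_a, frame_b in zip(seq_a, seq_b):
        frame = []
        for j, (pa, pb) in enumerate(zip(frame_a, frame_b)):
            if j in joint_ids:
                frame.append(tuple(lam * a + (1 - lam) * b for a, b in zip(pa, pb)))
            else:
                frame.append(tuple(pa))
        mixed.append(frame)
    return mixed


# Toy long-tailed label set: one head class, one tail class.
labels = ["walk"] * 8 + ["fall"] * 2
weights = inverse_frequency_weights(labels)

# Draw a rebalanced batch of sample indices with the standard library.
rng = random.Random(0)
batch = rng.choices(range(len(labels)), weights=weights, k=4)

# Two tiny sequences: 2 frames, 2 joints each.
seq_a = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)], [(2.0, 2.0, 2.0), (3.0, 3.0, 3.0)]]
seq_b = [[(2.0, 2.0, 2.0), (5.0, 5.0, 5.0)], [(4.0, 4.0, 4.0), (7.0, 7.0, 7.0)]]

reversed_a = temporal_reverse(seq_a)
mixed = partial_mixup(seq_a, seq_b, joint_ids={0}, lam=0.5)
```

With these weights, each of the eight "walk" samples is drawn with relative weight 1/8 and each of the two "fall" samples with relative weight 1/2, so both classes are equally likely overall. In `mixed`, only joint 0 is interpolated between the two sequences, while joint 1 keeps the values from `seq_a`.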