Two-Stream Auto-Encoder Network for Unsupervised Skeleton-Based Action Recognition

Gang Wang, Yaonan Guan, Dewei Li
DOI: https://doi.org/10.1007/s12204-023-2619-6
2023-01-01
Journal of Shanghai Jiaotong University (Science)
Abstract: Representation learning from unlabeled skeleton data is a challenging task. Prior unsupervised learning algorithms rely mainly on the modeling ability of recurrent neural networks to extract action representations. However, the structural information of skeleton data, which also plays a critical role in action recognition, is rarely explored in existing unsupervised methods. To address this limitation, we propose a novel two-stream autoencoder network that combines the topological information of skeleton data with its temporal information. Specifically, we encode the graph structure with a graph convolutional network (GCN) and integrate the extracted GCN-based representations into the gated recurrent unit (GRU) stream. We then design a transfer module to merge the representations of the two streams adaptively. Based on the characteristics of the two-stream autoencoder, a unified loss function composed of multiple tasks is proposed to update the learnable parameters of our model. Comprehensive experiments on the NW-UCLA, UWA3D, and NTU-RGBD 60 datasets demonstrate that the proposed method achieves excellent performance among unsupervised skeleton-based methods and even performs comparably to or better than numerous supervised skeleton-based methods.
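
The two ingredients named in the abstract, a graph convolution over the skeleton topology and an adaptive merge of two feature streams, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the symmetric adjacency normalization, the sigmoid-gated merge standing in for the transfer module, the toy 5-joint chain skeleton, all shapes, and every function name below are assumptions chosen for illustration, and the GRU stream is replaced by random placeholder features.

```python
import numpy as np

def normalized_adjacency(edges, num_joints):
    """Build D^-1/2 (A + I) D^-1/2 for an undirected skeleton graph."""
    a = np.eye(num_joints)  # self-loops
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(x, a_hat, w):
    """One graph convolution applied to every frame.

    x: (T, J, C_in) joint features, a_hat: (J, J), w: (C_in, C_out).
    """
    return np.maximum(a_hat @ x @ w, 0.0)  # ReLU activation

def gated_fusion(h_temporal, h_graph, w_gate):
    """Hypothetical adaptive merge: a sigmoid gate mixes the two streams.

    h_temporal, h_graph: (T, H); w_gate: (2H, H).
    """
    z = np.concatenate([h_temporal, h_graph], axis=-1) @ w_gate
    gate = 1.0 / (1.0 + np.exp(-z))
    return gate * h_temporal + (1.0 - gate) * h_graph

# Toy data (assumed): 4 frames, 5 joints in a chain, 3-D coordinates.
rng = np.random.default_rng(0)
T, J, C, H = 4, 5, 3, 8
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
skeleton = rng.normal(size=(T, J, C))

a_hat = normalized_adjacency(edges, J)
graph_feat = graph_conv(skeleton, a_hat, rng.normal(size=(C, H)))  # (T, J, H)
h_graph = graph_feat.mean(axis=1)        # pool over joints -> (T, H)
h_temporal = rng.normal(size=(T, H))     # placeholder for GRU stream output
fused = gated_fusion(h_temporal, h_graph, rng.normal(size=(2 * H, H)))
print(fused.shape)
```

Because the gate lies strictly between 0 and 1, each fused feature is a convex combination of the corresponding features from the two streams, which is one simple way to make the merge input-dependent.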