Heterogeneous Network Building and Embedding for Efficient Skeleton-Based Human Action Recognition

Qi Yang, Tianlong Wang, Qicong Wang, Yunqi Lei
DOI: https://doi.org/10.1109/PRAI55851.2022.9904108
2022-01-01
Abstract: Among human action recognition methods, modeling the human skeleton with a dual-stream adaptive graph convolutional network has achieved remarkable results. However, because such methods rely on unimodal data, they cannot exploit the potential semantic correlation between different modalities. In existing knowledge distillation methods, the model extracts the features of each modality independently, and there is no rich information interaction between modalities. Moreover, most models use the same network architecture to extract features from different modalities, which largely limits the distinguishability between modalities. In this work, we propose a novel model based on heterogeneous network building and embedding. We build a heterogeneous network to unify the features of different modalities, which are processed by different networks. We also propose a multimodal similarity loss function based on deep metric learning, which draws samples of the same class closer together and pushes samples of different classes apart. Extensive experiments on two datasets, NTU RGB+D and UTD-MHAD, demonstrate that our model effectively improves on the performance of unimodal methods and is comparable to other multimodal methods.