Lightweight Multi-Scale Spatiotemporal Graph Convolutional Network for Skeleton-Based Action Recognition

Zhiyun Zheng, Qilong Yuan, Huaizhu Zhang, Yizhou Wang, Junfeng Wang
DOI: https://doi.org/10.1109/cbd63341.2023.00032
2023-01-01
Abstract: Using graph convolutional networks (GCNs) to model human skeletons as spatiotemporal graphs has achieved exceptional results, because this approach explores the inner connections between human joints. However, existing methods overlook long-range dependencies between joints, which limits flexibility in temporal modeling. Moreover, existing models are over-parameterized, which increases computational cost. In this paper, a lightweight multi-scale spatiotemporal graph convolutional network (LMSTGCN) is proposed. First, a multi-scale spatial graph convolutional network (MSGCN) is designed using a hierarchical strategy: the input features are divided into multiple subsets along the channel dimension, so that various semantic connections between joints are obtained at a low computational cost. Second, dilated convolution is introduced into the temporal convolution module to obtain a wider effective receptive field without altering the convolution kernel size. Third, a spatiotemporal location attention (STLAtt) module is designed to find the most informative joints in specific frames of the skeleton sequence and to enhance the model's ability to extract and discriminate features in the action sequence. Finally, multi-stream data fusion is used to enrich the input data and expand the feature information. Experiments indicate that the model achieves higher accuracy at a lower computational cost.
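The abstract's claim that dilation widens the receptive field without changing the kernel size can be illustrated with a small sketch. This is not the paper's implementation: the function names `receptive_field` and `dilated_temporal_conv`, the toy 9-frame sequence, and the zero-padding choice are all assumptions made for illustration. A kernel of size k with dilation d spans (k - 1) * d + 1 frames while still using only k weights.

```python
from typing import List


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Number of frames spanned by one dilated temporal kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_temporal_conv(
    seq: List[float], kernel: List[float], dilation: int = 1
) -> List[float]:
    """Dilated 1D cross-correlation along the frame axis.

    Zero-padded so the output has the same length as the input
    (hypothetical helper; assumes an odd kernel size).
    """
    k = len(kernel)
    pad = (k - 1) * dilation // 2  # symmetric padding on both ends
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t - pad + i * dilation  # taps are spaced `dilation` frames apart
            if 0 <= idx < len(seq):  # positions outside the sequence count as zero
                acc += w * seq[idx]
        out.append(acc)
    return out


# Toy input: one joint coordinate tracked over 9 frames.
frames = [float(t) for t in range(9)]
kernel = [1.0, 1.0, 1.0]  # same 3 weights in both cases

print(receptive_field(3, 1), receptive_field(3, 2))
print(dilated_temporal_conv(frames, kernel, dilation=1))
print(dilated_temporal_conv(frames, kernel, dilation=2))
```

With dilation 2, each output frame aggregates inputs two frames apart, so the same three weights cover five frames instead of three. The parameter count is unchanged, which matches the lightweight goal stated in the abstract.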