TSGCNeXt: Dynamic-Static Multi-Graph Convolution for Efficient Skeleton-Based Action Recognition with Long-term Learning Potential

Dongjingdin Liu, Pengpeng Chen, Miao Yao, Yijing Lu, Zijie Cai, Yuxin Tian
2023-04-23
Abstract: Skeleton-based action recognition has achieved remarkable results with the development of graph convolutional networks (GCNs). However, recent works tend to construct complex learning mechanisms with redundant training, and they face a bottleneck on long time series. To solve these problems, we propose the Temporal-Spatio Graph ConvNeXt (TSGCNeXt) to explore an efficient learning mechanism for long temporal skeleton sequences. First, a new graph learning mechanism with a simple structure, Dynamic-Static Separate Multi-graph Convolution (DS-SMG), is proposed to aggregate the features of multiple independent topological graphs and to avoid node information being ignored during dynamic convolution. Next, we construct a graph convolution training acceleration mechanism that optimizes the back-propagation computation of dynamic graph learning, with a 55.08% speed-up. Finally, TSGCNeXt restructures the overall structure of the GCN with three spatio-temporal learning modules, efficiently modeling long temporal features. Compared with previous methods on the large-scale datasets NTU RGB+D 60 and 120, TSGCNeXt outperforms them in the single-stream setting. In addition, with the EMA model introduced into the multi-stream fusion, TSGCNeXt achieves SOTA levels. On the cross-subject and cross-set benchmarks of NTU 120, accuracies reach 90.22% and 91.74%.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses two bottlenecks of existing graph convolutional network (GCN) methods for skeleton-based action recognition: complex learning mechanisms that lower training efficiency, and insufficient ability to handle long time series. Specifically:

1. **Complex learning mechanisms**: Recent works tend to introduce complex dynamic graph learning mechanisms. These mechanisms can improve model performance, but they also add redundant training steps that reduce training efficiency. The inefficiency becomes more pronounced on long time-series data.
2. **Bottlenecks in long time-series learning**: To limit the parameter count and computational cost, existing methods usually shorten the time series, which loses fine-grained temporal information. In addition, the accuracy of some methods decreases when they learn from long time series, which limits their ability to model such data.

To address these problems, the paper proposes **Temporal-Spatio Graph ConvNeXt (TSGCNeXt)**, which makes the following improvements:

- **New graph learning mechanism**: The **Dynamic-Static Separate Multi-graph Convolution (DS-SMG)** module aggregates the features of multiple independent topological graphs and avoids ignoring node information during dynamic convolution.
- **Graph convolution training acceleration mechanism**: The back-propagation computation of dynamic graph learning is optimized, increasing training speed by a reported 55.08% compared with traditional methods.
- **Overall structure optimization**: The overall GCN structure is rebuilt around three spatio-temporal learning modules that efficiently model long temporal features.
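To make the idea of aggregating several topological graphs concrete, here is a minimal pure-Python sketch of a generic multi-graph convolution. It is not the paper's DS-SMG module, whose exact formulation is not given here. The function names, the row normalization of the static graphs, and the softmax-of-dot-products dynamic graph are all assumptions chosen for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def row_normalize(adj):
    """Scale each row of a non-negative adjacency matrix so it sums to 1."""
    out = []
    for row in adj:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out

def dynamic_adjacency(x):
    """Hypothetical data-dependent graph: row-wise softmax of x @ x^T."""
    scores = matmul(x, [list(col) for col in zip(*x)])
    adj = []
    for row in scores:
        m = max(row)  # subtract the max for numerical stability
        e = [math.exp(v - m) for v in row]
        z = sum(e)
        adj.append([v / z for v in e])
    return adj

def multi_graph_conv(x, static_adjs, weights):
    """Sum A_k @ X @ W_k over the static graphs plus one dynamic graph.

    x:           joint features, shape (num_joints, c_in)
    static_adjs: list of fixed (num_joints, num_joints) adjacency matrices
    weights:     one (c_in, c_out) matrix per static graph, then one more
                 for the dynamic branch
    """
    graphs = [row_normalize(a) for a in static_adjs] + [dynamic_adjacency(x)]
    n, c_out = len(x), len(weights[0][0])
    out = [[0.0] * c_out for _ in range(n)]
    for adj, w in zip(graphs, weights):
        branch = matmul(matmul(adj, x), w)  # aggregate neighbours, then project
        for i in range(n):
            for j in range(c_out):
                out[i][j] += branch[i][j]
    return out

# Toy skeleton: three joints in a chain (0 - 1 - 2), two features per joint.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
self_links = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
neighbours = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
features = multi_graph_conv(x, [self_links, neighbours], [identity] * 3)
print(len(features), len(features[0]))
```

Each graph keeps its own projection matrix in this sketch, so the static and dynamic branches stay separate until the final sum. A real model would learn the weights and apply the operation per frame with a tensor library.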
With these improvements, TSGCNeXt performs strongly as a single-stream network and reaches state-of-the-art level with multi-stream fusion. Experimental results show that TSGCNeXt outperforms existing methods on the large-scale datasets NTU RGB+D 60 and 120, especially when processing long time-series data.
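As a rough illustration of the two ingredients named above, the sketch below shows an exponential moving average (EMA) of parameters and a weighted fusion of per-stream class scores. How the paper combines EMA with fusion is not described here, so this is an assumption-laden sketch: it assumes EMA means the usual moving average of model weights, and that fusion is a weighted sum of scores from streams such as joint and bone inputs. All names and numbers are hypothetical.

```python
def ema_update(ema_params, params, decay=0.999):
    """One EMA step: ema <- decay * ema + (1 - decay) * current parameters."""
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]

def fuse_scores(stream_scores, stream_weights):
    """Weighted sum of per-class scores over all streams."""
    num_classes = len(stream_scores[0])
    fused = [0.0] * num_classes
    for scores, w in zip(stream_scores, stream_weights):
        for c in range(num_classes):
            fused[c] += w * scores[c]
    return fused

def predict(stream_scores, stream_weights):
    """Index of the class with the highest fused score."""
    fused = fuse_scores(stream_scores, stream_weights)
    return max(range(len(fused)), key=fused.__getitem__)

# Hypothetical two-class scores from a joint stream and a bone stream.
joint_scores = [0.6, 0.4]
bone_scores = [0.2, 0.8]
label = predict([joint_scores, bone_scores], [1.0, 1.0])
print(label)
```

In practice, the EMA copy of each stream's weights would be evaluated in place of, or alongside, the raw weights, and the stream weights would be tuned on a validation split.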