Dynamic Gesture Recognition Method Based on Improved R(2+1)D

Yupeng Huo, Jie Shen, Sheng Zhang, Li Wang
DOI: https://doi.org/10.1109/cvidl58838.2023.10166120
2023-01-01
Abstract: Methods based on 3D convolutional neural networks have made significant progress in dynamic gesture recognition. However, dynamic gestures are highly redundant in both the temporal and spatial dimensions, and complex environments can easily degrade the final recognition results. It is therefore crucial for the model to focus on the important moments and regions of a gesture movement and to extract the salient spatiotemporal features. To address this issue, this paper proposes a lightweight Temporal-Spatial-Channel attention (TSCA) module based on the R(2+1)D network. The module consists of two sub-modules, a Temporal-Channel attention (TCA) module and a Temporal-Spatial attention (TSA) module, which together let the model attend to important information along the temporal, spatial, and channel dimensions of a gesture movement. Integrating the TSCA module into the R(2+1)D network adds only 2.8M parameters and achieves good performance on the IPN-Hand and NvGesture datasets.
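As background on the backbone named in the title: R(2+1)D replaces each full 3D convolution with a 2D spatial convolution followed by a 1D temporal convolution. The sketch below is not taken from this paper. It only counts weights for that factorization, using the intermediate-width rule from the original R(2+1)D design, which keeps the factorized block at roughly the same size as the 3D convolution it replaces. The function names are hypothetical, the 3x3x3 kernel defaults are assumptions, and bias terms are ignored.

```python
def conv3d_params(n_in, n_out, t=3, d=3):
    """Weights in a full t x d x d 3D convolution (bias ignored)."""
    return n_in * n_out * t * d * d


def mid_channels(n_in, n_out, t=3, d=3):
    """Intermediate width M between the spatial and temporal convolutions.

    M = floor(t * d^2 * n_in * n_out / (d^2 * n_in + t * n_out)),
    chosen so the factorized block does not exceed the 3D conv's weight count.
    """
    return (t * d * d * n_in * n_out) // (d * d * n_in + t * n_out)


def conv2plus1d_params(n_in, n_out, t=3, d=3):
    """Weights in the factorized (2+1)D block (bias ignored)."""
    m = mid_channels(n_in, n_out, t, d)
    spatial = n_in * m * d * d   # 1 x d x d convolution: n_in -> m channels
    temporal = m * n_out * t     # t x 1 x 1 convolution: m -> n_out channels
    return spatial + temporal


if __name__ == "__main__":
    for n_in, n_out in [(64, 64), (64, 128)]:
        print(n_in, n_out,
              mid_channels(n_in, n_out),
              conv3d_params(n_in, n_out),
              conv2plus1d_params(n_in, n_out))
```

Because the factorized block is sized to match the 3D convolution, any parameter growth such as the 2.8M reported in the abstract comes from the added attention module rather than from the backbone's factorization.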