Attention Mechanism Based on Improved Spatial-Temporal Convolutional Neural Networks for Traffic Police Gesture Recognition

Zhixuan Wu, Nan Ma, Yue Gao, Jiahong Li, Xinkai Xu, Yongqiang Yao, Li Chen
DOI: https://doi.org/10.1142/s0218001422560018
IF: 1.261
2022-01-01
International Journal of Pattern Recognition and Artificial Intelligence
Abstract: Human action recognition has attracted extensive research effort in recent years, and traffic police gesture recognition in particular is important for self-driving vehicles. A crucial challenge in this task is finding a representation based on spatial-temporal features. However, existing methods perform poorly at fusing spatial and temporal information, and feature extraction for traffic police gestures has not been well studied. This paper proposes an attention mechanism based on an improved spatial-temporal convolutional neural network (AMSTCNN) for traffic police gesture recognition. The method focuses on the region where the traffic police officer's action occurs and uses the correlation between spatial and temporal features to recognize gestures, so that gesture information is not lost. Specifically, AMSTCNN integrates spatial and temporal information, uses weight matching to pay more attention to the region where human action occurs, and extracts region proposals from the image. Finally, we use Softmax to classify actions after spatial-temporal feature fusion. AMSTCNN makes effective use of the spatial-temporal information in videos and selects effective features to reduce computation. Experiments on the AVA and Chinese traffic police gesture datasets show that our method is superior to several state-of-the-art methods.
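The pipeline outlined in the abstract (attention weighting over regions, spatial-temporal feature fusion, Softmax classification) can be illustrated with a minimal NumPy sketch. This is not the AMSTCNN architecture itself, which the abstract does not specify in detail. It assumes that per-region spatial features and per-frame temporal features have already been extracted, and it swaps in scaled dot-product attention and simple concatenation as stand-ins for the paper's weight matching and fusion steps. The names `attention_pool`, `classify_gesture`, `q_spatial`, `q_temporal`, `W`, and `b`, and all dimensions, are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def attention_pool(features, query):
    """Weight N feature vectors by relevance to a query and sum them.

    features: (N, D) array, one row per region or frame (assumed precomputed).
    query:    (D,) array, a learned vector in a real model (random here).
    Returns the pooled (D,) vector and the (N,) attention weights.
    """
    scores = features @ query / np.sqrt(features.shape[1])
    weights = softmax(scores)
    return weights @ features, weights

def classify_gesture(spatial, temporal, params):
    """Attend over regions and frames, fuse by concatenation, apply Softmax."""
    s_vec, s_w = attention_pool(spatial, params["q_spatial"])
    t_vec, t_w = attention_pool(temporal, params["q_temporal"])
    fused = np.concatenate([s_vec, t_vec])      # assumed fusion: concatenation
    logits = params["W"] @ fused + params["b"]  # linear classification head
    return softmax(logits), s_w, t_w

# Placeholder sizes: feature dim, region proposals, frames, gesture classes.
D, R, T, C = 16, 6, 8, 8
rng = np.random.default_rng(0)
params = {
    "q_spatial": rng.normal(size=D),
    "q_temporal": rng.normal(size=D),
    "W": rng.normal(size=(C, 2 * D)),
    "b": np.zeros(C),
}
probs, s_w, t_w = classify_gesture(
    rng.normal(size=(R, D)), rng.normal(size=(T, D)), params
)
print("predicted class:", int(np.argmax(probs)))
```

With random parameters the predicted class is meaningless. The sketch only shows how attention weights concentrate the pooled feature on a few regions or frames before the fused vector is classified.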