Traffic police command gesture recognition technology based on machine vision and two-stream spatio-temporal attention graph convolutional network

Yuan Li
DOI: https://doi.org/10.1117/12.2684174
2023-08-01
Abstract: To meet the need of driverless cars to automatically recognize traffic police gestures against complex backgrounds using vision sensors, we propose a traffic police gesture recognition method based on a two-stream spatio-temporal attention graph convolutional network (2s-AGCN) that takes skeletal data of two different dimensionalities. First, the commanding traffic police officer is detected in the video, and 2D and 3D skeletal data are extracted with a pose estimation algorithm to reduce the influence of complex backgrounds and joint overlap on action recognition; a spatio-temporal graph model is then built. Next, we construct the 2s-AGCN network and feed the 2D and 3D skeletal sequences into it to learn the spatio-temporal features of the gesture actions. Finally, the information from the two streams is fused and the final traffic police gesture category is output. At the spatial level, 2s-AGCN uses Non-Local and TopK operations to attend to all nodes directly and select the K neighbors with the strongest interaction strength; temporal attention is used to focus on the frames that contribute most. An ablation study on the CTPGD dataset shows that the method significantly improves the recognition accuracy of traffic police command gestures, especially those with overlapping skeleton points.
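The TopK neighbor selection and the two-stream fusion described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the selected neighbors are weighted or how the streams are fused, so the softmax renormalization over the kept neighbors, the score-level weighted sum, and the names `topk_attention`, `fuse_streams`, and `weight_2d` are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def topk_attention(scores, k):
    """For each joint (row), keep only the k neighbors with the highest
    pairwise interaction score and renormalize them with a softmax.

    scores: square list of lists; scores[i][j] is the (assumed) interaction
    strength between joints i and j, e.g. from a Non-Local style similarity.
    Returns a sparse attention matrix of the same shape.
    """
    attention = []
    for row in scores:
        # Indices of the k largest scores in this row.
        keep = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        weights = softmax([row[j] for j in keep])
        sparse_row = [0.0] * len(row)
        for j, w in zip(keep, weights):
            sparse_row[j] = w
        attention.append(sparse_row)
    return attention


def fuse_streams(logits_2d, logits_3d, weight_2d=0.5):
    """Assumed late fusion: weighted sum of the per-class softmax scores
    of the 2D and 3D streams. Returns (predicted class index, fused scores)."""
    p2d, p3d = softmax(logits_2d), softmax(logits_3d)
    fused = [weight_2d * a + (1.0 - weight_2d) * b for a, b in zip(p2d, p3d)]
    predicted = max(range(len(fused)), key=lambda c: fused[c])
    return predicted, fused


if __name__ == "__main__":
    # Toy 4-joint score matrix (made-up values, not from the paper).
    scores = [
        [0.0, 2.0, 1.0, -1.0],
        [2.0, 0.0, 0.5, 0.1],
        [1.0, 0.5, 0.0, 3.0],
        [-1.0, 0.1, 3.0, 0.0],
    ]
    for row in topk_attention(scores, k=2):
        print([round(w, 3) for w in row])

    # Toy per-class logits for three gesture classes from each stream.
    label, fused = fuse_streams([2.0, 0.0, 0.0], [0.0, 0.0, 3.0])
    print(label, [round(p, 3) for p in fused])
```

In this sketch each joint attends to exactly K neighbors regardless of the fixed skeleton topology, which is one way to read "focus on all nodes directly" in the abstract; a real model would learn the scores and apply the resulting matrix inside the graph convolution.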
Engineering, Computer Science