Multi-Feature Fusion Real-Time Action Recognition Based on 2D to 3D Skeleton

Ren Guoyin, Lu Xiaoqi, Li Yuhao
DOI: https://doi.org/10.3788/LOP202158.2410010
2021-01-01
Laser & Optoelectronics Progress
Abstract: We propose a real-time two-branch detection subnetwork based on two-dimensional (2D) to three-dimensional (3D) skeletons, which performs 3D estimation of 2D skeleton key points and 3D human action recognition based on the fusion of 2D and 3D skeleton features. In the detection stage, the OpenPose framework is used to obtain the 2D key point coordinates of the human skeleton from video in real time. In the 2D-to-3D skeleton estimation stage, a Siamese network with hard-sample input and a feedback function is designed. In the 3D action recognition stage, a two-branch Siamese network operating on 2D and 3D skeleton features is designed to complete the task of 3D pose recognition. The 3D skeleton estimation network is trained on the Human3.6M data set, and the skeleton action recognition network is trained on the NTU RGB+D 60 data set with multi-view augmentation based on the Euler transform. The resulting cross-subject and cross-view accuracies are 88.2% and 95.6%, respectively. Experimental results show that the method has high prediction accuracy for 3D skeletons and real-time feedback ability, and can be applied to action recognition in real-time monitoring.
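The abstract does not spell out how the Euler-transform-based multi-view augmentation is carried out. As a minimal sketch, assuming it amounts to rotating each 3D skeleton by a set of Euler angles to synthesize additional viewpoints, the idea could look like the following. The function names, the Z-Y-X rotation order, and the sample joints are illustrative assumptions, not the authors' implementation.

```python
import math


def euler_rotation_matrix(yaw, pitch, roll):
    """Build R = Rz(yaw) * Ry(pitch) * Rx(roll); angles are in radians.

    The Z-Y-X order is an assumption made for this sketch.
    """
    cz, sz = math.cos(yaw), math.sin(yaw)
    cy, sy = math.cos(pitch), math.sin(pitch)
    cx, sx = math.cos(roll), math.sin(roll)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def rotate_skeleton(joints, yaw=0.0, pitch=0.0, roll=0.0):
    """Rotate every (x, y, z) joint about the origin to simulate a new view."""
    r = euler_rotation_matrix(yaw, pitch, roll)
    return [
        tuple(sum(r[i][k] * p[k] for k in range(3)) for i in range(3))
        for p in joints
    ]


def augment_views(joints, yaw_angles):
    """Return one rotated copy of the skeleton per yaw angle (hypothetical helper)."""
    return [rotate_skeleton(joints, yaw=a) for a in yaw_angles]


# Two made-up joints, rotated to three synthetic camera yaws.
skeleton = [(1.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
views = augment_views(skeleton, [0.0, math.pi / 4, math.pi / 2])
```

Because a rotation preserves distances between joints, bone lengths stay the same in every synthesized view; only the apparent viewpoint changes.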