Dynamic Hand Gesture Recognition Based on 3D Hand Pose Estimation for Human-Robot Interaction

Qing Gao, Yongquan Chen, Zhaojie Ju, Yi Liang
DOI: https://doi.org/10.1109/jsen.2021.3059685
IF: 4.3
2021-01-01
IEEE Sensors Journal
Abstract: Dynamic hand gesture recognition is a challenging problem in hand-based human–robot interaction (HRI), owing to complex environments and the demands of dynamic perception. To address this problem, we draw on the principle of data-glove-based hand gesture recognition and propose a dynamic hand gesture recognition method based on 3D hand pose estimation. The method uses 3D hand pose estimation, data fusion, and a deep neural network to improve the recognition accuracy of dynamic hand gestures. First, a 2D hand pose estimation method based on OpenPose is improved to obtain a fast 3D hand pose estimation method. Second, a weighted-sum fusion method is used to combine the RGB, depth, and 3D skeleton data of hand gestures. Finally, a 3DCNN + ConvLSTM framework is used to identify and classify the combined dynamic hand gesture data. In the experiments, the proposed method is verified on a dynamic hand gesture database developed for HRI and achieves 92.4% accuracy. Comparative experimental results verify the reliability and efficiency of the proposed method.
engineering, electrical & electronic; instruments & instrumentation; physics, applied
What problem does this paper attempt to address?
This paper addresses several key issues in dynamic gesture recognition for vision-based human-robot interaction (HRI):

1. **Dynamic Perception in Complex Environments**: Existing methods struggle to recognize dynamic gestures accurately against complex backgrounds and in complex environments.
2. **Extraction of Spatiotemporal Features**: Dynamic gestures carry both spatial and temporal features, which methods designed for static gestures handle poorly.
3. **Accuracy of Hand Pose Estimation**: Accurately detecting and estimating hand poses is crucial for dynamic gesture recognition.

To address these issues, the authors propose a dynamic gesture recognition method based on 3D hand pose estimation, implemented in three steps:

1. **Fast 3D Hand Pose Estimation**: A 2D hand pose estimation method based on OpenPose is improved to obtain a fast 3D hand pose estimation method.
2. **Data Fusion**: RGB, depth, and 3D skeleton data are combined by weighted-sum fusion to make full use of the multimodal information.
3. **Feature Extraction and Classification**: A 3DCNN + ConvLSTM framework recognizes and classifies the fused dynamic gesture data.

In the experiments, the method achieved 92.4% accuracy on a dynamic gesture database developed for HRI, and comparative experiments support its reliability and efficiency.
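The data fusion step can be illustrated with a minimal NumPy sketch of weighted-sum fusion. This is not the authors' implementation. It assumes the depth and skeleton modalities have already been rendered as image clips with the same shape as the RGB clip, `(frames, height, width, channels)`. The function name `weighted_sum_fusion` and the default weights are hypothetical, since the abstract does not give the weights used in the paper.

```python
import numpy as np


def weighted_sum_fusion(rgb, depth, skeleton, weights=(0.5, 0.3, 0.2)):
    """Fuse three aligned clips by a per-pixel weighted sum.

    Each clip is assumed to have shape (frames, height, width, channels).
    The default weights are illustrative placeholders, not values from the paper.
    """
    clips = [np.asarray(c, dtype=np.float32) for c in (rgb, depth, skeleton)]
    if not (clips[0].shape == clips[1].shape == clips[2].shape):
        raise ValueError("all modalities must have the same shape")

    w = np.asarray(weights, dtype=np.float32)
    if w.shape != (3,) or np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be three non-negative numbers with a positive sum")
    w = w / w.sum()  # normalize so the fused values stay in the input range

    # Weighted sum over the three modalities, computed element-wise.
    return w[0] * clips[0] + w[1] * clips[1] + w[2] * clips[2]


if __name__ == "__main__":
    # Synthetic 4-frame, 8x8, 3-channel clips standing in for real data.
    rng = np.random.default_rng(0)
    rgb = rng.random((4, 8, 8, 3))
    depth = rng.random((4, 8, 8, 3))
    skeleton = rng.random((4, 8, 8, 3))
    fused = weighted_sum_fusion(rgb, depth, skeleton)
    print(fused.shape)
```

The fused clip keeps the shape of its inputs, so it could be passed on as a single video tensor to a spatiotemporal classifier such as the 3DCNN + ConvLSTM framework named above.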