Multiple-Hand 2D Pose Estimation From a Monocular RGB Image

Purnendu Mishra, Kishor Sarawadekar
DOI: https://doi.org/10.1109/access.2024.3376426
IF: 3.9
2024-03-27
IEEE Access
Abstract: Deep learning models and algorithms make hand pose estimation from monocular RGB images easier than traditional approaches do. Despite this, the majority of available algorithms use multiple-stage models to perform hand pose estimation. Moreover, single-stage methods are mainly limited to a single hand and are difficult to scale to multiple hands. To this end, we propose an approach that reuses the saliency-map features extracted for hand region-of-interest (ROI) localization: an integrated network uses these features for pose estimation. This arrangement of layers forms an end-to-end pipeline that allows simultaneous pose estimation for multiple hands. The model is designed to run on multiple CPU/GPU cores and to perform inference independently for each detected hand's pose, which makes inference faster and hence suitable for real-time applications. In addition, a new grid-based design for estimating hand keypoint positions with high precision is proposed. Both proposed designs are validated on multiple datasets to demonstrate their feasibility and effectiveness. The probability of correct keypoint (PCK) at a threshold of 0.2 is above 95% on the test sets of the Interhand dataset and the Rendered HandPose Dataset (RHD).
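For readers unfamiliar with the PCK metric quoted in the abstract, the following is a minimal sketch of how such a score is commonly computed. The abstract does not state the paper's exact normalisation, so the choice here (a hand bounding-box size passed as `norm_size`) is an assumption, and the function name `pck` and the toy coordinates are purely illustrative.

```python
import math

def pck(pred, gt, norm_size, threshold=0.2):
    """Fraction of keypoints whose error is within threshold * norm_size.

    pred, gt: lists of (x, y) pixel coordinates, one pair per keypoint.
    norm_size: normalising length in pixels (assumed: hand bounding-box size).
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    correct = 0
    for (px, py), (gx, gy) in zip(pred, gt):
        # A keypoint counts as correct if its Euclidean error is within tolerance.
        if math.hypot(px - gx, py - gy) <= threshold * norm_size:
            correct += 1
    return correct / len(pred)

# Toy example: 4 keypoints and a 100 px hand box, so the tolerance is 20 px.
gt = [(10, 10), (50, 50), (80, 20), (30, 90)]
pred = [(12, 11), (50, 75), (85, 20), (30, 88)]
score = pck(pred, gt, norm_size=100)
print(score)
```

In this toy case only the second keypoint (25 px off) exceeds the 20 px tolerance, so three of four keypoints count as correct. A reported PCK above 95% at threshold 0.2 would mean that, under whatever normalisation the authors use, more than 95% of predicted keypoints fall inside the tolerance.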
computer science, information systems; telecommunications; engineering, electrical & electronic