A graph-based approach for absolute 3D hand pose estimation using a single RGB image

Ikram Kourbane, Yakup Genc
DOI: https://doi.org/10.1007/s10489-022-03390-x
IF: 5.3
2022-03-25
Applied Intelligence
Abstract: Monocular RGB-based 3D hand pose estimation is crucial for a wide range of augmented reality and human-computer interaction applications. However, this task is highly challenging due to occlusion, scale, and depth ambiguities. Most existing methods focus on estimating a scale-normalized, root-relative 3D pose from the cropped hand image. In this work, we propose a multi-stage approach based on Graph Convolutional Networks (GCNs) to estimate the absolute 3D hand pose from a single RGB image. We exploit both the cropped hand and the global scene image, which provides clues about the hand scale and location in camera space. Our network consists of three main stages: 2D keypoint estimation, 3D root-relative pose estimation, and 3D absolute pose estimation. To achieve better performance, we propose a new loss function that separates the extracted image features based on 3D joint locations to simplify the regression task. Extensive experiments on five public datasets show that our efficient model estimates accurate global 3D hand poses and performs favorably against several baselines and state-of-the-art methods. We also validate the proposed approach on a newly created dataset, which contains RGB hand images with accurate 3D pose annotations and wide variation in lighting and pose.
computer science, artificial intelligence