Cascaded hierarchical CNN for 2D hand pose estimation from a single color image
Mingyue Zhang, Zhiheng Zhou, Ming Deng
DOI: https://doi.org/10.1007/s11042-022-12780-9
IF: 2.577
2022-03-24
Multimedia Tools and Applications
Abstract: Due to the severe articulation, self-occlusion, varying scales, and high dexterity of the hand, hand pose estimation is more challenging than body pose estimation. Recently developed body pose estimation algorithms are not suitable for addressing the unique challenges of hand pose estimation because they are trained without explicitly modeling the structural relationships between keypoints. In this paper, we propose a novel cascaded hierarchical CNN (CH-HandNet) for 2D hand pose estimation from a single color image. CH-HandNet consists of three modules: hand mask segmentation, preliminary 2D hand pose estimation, and hierarchical estimation. The first module obtains a hand mask using a hand mask segmentation network. The second module connects the hand mask with the intermediate image features to estimate the 2D hand heatmaps. The last module connects the hand heatmaps with the intermediate image features and the hand mask to estimate finger and palm heatmaps hierarchically. Finally, the extracted Finger (pinky, ring, middle, index) and Palm (thumb and palm) features are fused to estimate the 2D hand pose. Experimental results on three datasets (OneHand 10k, Panoptic, and Eric.Lee) consistently show that the proposed CH-HandNet outperforms previous state-of-the-art hand pose estimation methods.
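The three-module cascade described in the abstract can be illustrated with a minimal data-flow sketch. This is not the authors' implementation. The sub-networks are replaced by a hypothetical `stub_head` (a random 1x1 projection), "connects" is assumed to mean channel-wise concatenation, the fusion step is reduced to stacking the two heatmap groups, and all sizes (feature channels, 16 finger keypoints, 5 palm keypoints) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes for illustration; none of these come from the paper.
H, W = 32, 32        # spatial size of feature maps and heatmaps
C_FEAT = 8           # channels of the intermediate image features
N_FINGER = 16        # assumed: 4 fingers (pinky, ring, middle, index) x 4 joints
N_PALM = 5           # assumed: 4 thumb joints + 1 palm point
N_KPT = N_FINGER + N_PALM


def stub_head(x, out_channels):
    """Hypothetical stand-in for a learned sub-network:
    a random 1x1 projection over channels followed by a sigmoid."""
    w = rng.standard_normal((out_channels, x.shape[0]))
    y = np.tensordot(w, x, axes=([1], [0]))  # shape: (out_channels, H, W)
    return 1.0 / (1.0 + np.exp(-y))


def ch_handnet_sketch(features):
    """Data flow of the cascade on a (C_FEAT, H, W) feature tensor."""
    # Module 1: hand mask segmentation.
    mask = stub_head(features, 1)
    # Module 2: preliminary hand heatmaps from features + mask.
    x2 = np.concatenate([features, mask], axis=0)
    hand_hm = stub_head(x2, N_KPT)
    # Module 3: finger and palm heatmaps from features + mask + hand heatmaps.
    x3 = np.concatenate([features, mask, hand_hm], axis=0)
    finger_hm = stub_head(x3, N_FINGER)
    palm_hm = stub_head(x3, N_PALM)
    # Fusion (assumed here to be simple stacking of the two groups).
    final_hm = np.concatenate([finger_hm, palm_hm], axis=0)
    return mask, hand_hm, final_hm


def decode_keypoints(heatmaps):
    """Read one (x, y) location per heatmap from its peak."""
    k, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(k, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat_idx, (h, w))
    return np.stack([xs, ys], axis=1)


features = rng.standard_normal((C_FEAT, H, W))
mask, hand_hm, final_hm = ch_handnet_sketch(features)
keypoints = decode_keypoints(final_hm)
```

The point of the sketch is the growing input at each stage: the second stage sees features plus mask, and the third stage additionally sees the preliminary heatmaps, so later predictions can be conditioned on earlier ones.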
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering