SegPoseNet: Segmentation-Guided 3D Hand Pose Estimation

Yunlong Sun, Xiaoyun Chang, Yi Sun, Xiangbo Lin
DOI: https://doi.org/10.1109/icist52614.2021.9440561
2021-01-01
Abstract: This paper presents SegPoseNet, a multi-task learning architecture for 3D hand pose estimation. Previous methods usually gather image features from a single perspective and therefore cannot make full use of the available information. In this work, we exploit the hand part segmentation task and introduce semantic guidance information to help the hand pose estimation task. The use of segmentation information is novel and effective in hand pose estimation. Since predicting 3D hand pose directly is hard, we design a two-stage network consisting of 2D-StageNet and 3D-StageNet. Given an RGB image, 2D-StageNet performs 2D hand pose estimation and 2D hand part segmentation simultaneously under a multi-task learning mechanism. 3D-StageNet fuses the 2D heatmaps and the 2D segmentation silhouette to predict 3D heatmaps, which are finally converted into estimated 3D joint coordinates. We conduct experiments on the public hand benchmark Rendered Hand Dataset (RHD) and achieve state-of-the-art performance. Extended ablation experiments demonstrate the effectiveness of the segmentation information.
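The abstract does not say how the 3D heatmaps are converted into joint coordinates. One common choice is a soft-argmax, which takes the softmax-weighted average of voxel positions; the sketch below illustrates that idea only and is not the authors' confirmed method. The function name `soft_argmax_3d`, the `temperature` parameter, and the `heatmap[z][y][x]` layout are assumptions introduced for illustration.

```python
import math


def soft_argmax_3d(heatmap, temperature=1.0):
    """Return the expected (x, y, z) voxel coordinate of one 3D heatmap.

    Assumption: heatmap is a nested list indexed as heatmap[z][y][x],
    holding one unnormalized score per voxel for a single joint.
    """
    # Subtract the global maximum before exponentiating for numerical stability.
    peak = max(v for plane in heatmap for row in plane for v in row)

    total = 0.0
    ex = ey = ez = 0.0
    for z, plane in enumerate(heatmap):
        for y, row in enumerate(plane):
            for x, v in enumerate(row):
                w = math.exp((v - peak) / temperature)  # softmax numerator
                total += w
                ex += w * x
                ey += w * y
                ez += w * z

    # Dividing by the sum of weights completes the softmax normalization.
    return (ex / total, ey / total, ez / total)


def make_peaked_heatmap(size, at, height=50.0):
    """Build a size^3 heatmap of zeros with one strong peak at at = (x, y, z)."""
    px, py, pz = at
    return [
        [[height if (x, y, z) == (px, py, pz) else 0.0 for x in range(size)]
         for y in range(size)]
        for z in range(size)
    ]


# Hypothetical usage: one joint whose heatmap peaks at voxel (2, 1, 3).
joint_xyz = soft_argmax_3d(make_peaked_heatmap(4, (2, 1, 3)))
```

Unlike a hard argmax, this estimate is differentiable and can fall between voxel centers, which is why it is often preferred when the coordinates feed a training loss. Voxel indices would still need to be mapped to metric or camera coordinates, a step the abstract does not describe.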