Abstract: 3D hand pose estimation from a single depth image plays an important role in computer vision and human-computer interaction. Although recent hand pose estimation methods based on convolutional neural networks (CNNs) have shown notable improvements in accuracy, most of them rely on a complex network structure without fully exploiting the articulated structure of the hand. A hand, which is an articulated object, is composed of six local parts: the palm and five independent fingers. Each finger consists of sequential joints that provide constrained motion, referred to as a kinematic chain. In this paper, we propose a hierarchically structured convolutional recurrent neural network (HCRNN) with six branches that estimate the 3D positions of the palm and the five fingers independently. The palm position is predicted via fully connected layers. The position of each finger, i.e. of its sequential joints, is obtained using a recurrent neural network (RNN) that captures the spatial dependencies between adjacent joints. The output features of the palm and finger branches are then concatenated to estimate the global hand position. HCRNN takes the depth map directly as input, without time-consuming conversion to other data representations such as 3D voxels or point clouds. Experimental results on public datasets demonstrate that the proposed HCRNN not only outperforms most 2D CNN-based methods that take the depth image as input but also achieves results competitive with state-of-the-art 3D CNN-based methods, while running at a highly efficient 285 fps on a single GPU.
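
The six-branch layout described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature size, hidden size, number of joints per finger, the plain tanh RNN cell, the random weights, and all function names (`dense`, `palm_branch`, `finger_branch`, `hcrnn_like_forward`) are assumptions made for illustration, and the convolutional feature extractor is replaced by a random feature vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions; the abstract does not state them.
FEAT = 64               # size of the shared CNN feature vector
HID = 32                # hidden size of each branch
JOINTS_PER_FINGER = 4   # joints along one finger's kinematic chain
N_FINGERS = 5


def dense(n_in, n_out):
    """Random weights standing in for a trained fully connected layer."""
    return rng.standard_normal((n_in, n_out)) * 0.1, np.zeros(n_out)


def palm_branch(feat, params):
    """Palm branch: fully connected layers -> (branch feature, 3D palm position)."""
    (w1, b1), (w2, b2) = params
    h = np.tanh(feat @ w1 + b1)
    return h, h @ w2 + b2


def finger_branch(feat, params):
    """Finger branch: a plain RNN steps joint by joint along the chain.

    The hidden state h carries information from one joint to the next,
    which is how the dependency between adjacent joints is modeled here.
    """
    (w_in, b_in), (w_h, _), (w_out, b_out) = params
    h = np.zeros(HID)
    joints = []
    for _ in range(JOINTS_PER_FINGER):
        h = np.tanh(feat @ w_in + b_in + h @ w_h)
        joints.append(h @ w_out + b_out)
    return h, np.stack(joints)


def hcrnn_like_forward(feat, palm_p, finger_ps, fuse_p):
    """Run the six branches, then concatenate their features for a global estimate."""
    palm_h, palm_xyz = palm_branch(feat, palm_p)
    finger_out = [finger_branch(feat, p) for p in finger_ps]
    fused = np.concatenate([palm_h] + [h for h, _ in finger_out])
    w, b = fuse_p
    n_joints = 1 + N_FINGERS * JOINTS_PER_FINGER
    global_xyz = (fused @ w + b).reshape(n_joints, 3)
    return palm_xyz, [j for _, j in finger_out], global_xyz


palm_p = [dense(FEAT, HID), dense(HID, 3)]
finger_ps = [[dense(FEAT, HID), dense(HID, HID), dense(HID, 3)]
             for _ in range(N_FINGERS)]
fuse_p = dense(HID * (1 + N_FINGERS), 3 * (1 + N_FINGERS * JOINTS_PER_FINGER))

feat = rng.standard_normal(FEAT)  # stand-in for CNN features of a depth map
palm_xyz, finger_xyz, global_xyz = hcrnn_like_forward(feat, palm_p, finger_ps, fuse_p)
print(palm_xyz.shape, finger_xyz[0].shape, global_xyz.shape)
```

With the assumed sizes, the palm branch yields one 3D position, each finger branch yields one 3D position per joint in its chain, and the fused head yields a position for every joint of the hand. Each branch depends only on the shared feature vector, so the six branches could be evaluated independently before the concatenation step.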

Hand Pose Estimation Using Convolutional Neural Networks and Support Vector Regression.

Dual Regression for Efficient Hand Pose Estimation

Real-Time 3D Hand Pose Estimation with 3D Convolutional Neural Networks

Networks Effectively Utilizing 2D Spatial Information for Accurate 3D Hand Pose Estimation

Hand pose estimation in depth image using CNN and random forest

Accurate 3D Hand Pose Estimation Network Utilizing Joints Information.

A CNN Model for Real Time Hand Pose Estimation.

Hand Pose Regression Via a Classification-Guided Approach

Hand3D: Hand Pose Estimation using 3D Neural Network

Fast and Accurate 3D Hand Pose Estimation via Recurrent Neural Network for Capturing Hand Articulations

HMTNet: 3D Hand Pose Estimation from Single Depth Image Based on Hand Morphological Topology

Pose Estimation Using Convolutional Neural Network with Synthesis Depth Data

Region Ensemble Network: Improving Convolutional Network for Hand Pose Estimation

Joint Hand Detection and Rotation Estimation Using CNN.

Human Pose Estimation Based on Step Deep Convolution Neural Network.

Deep Predictive Neural Network: Unsupervised Learning for Hand Pose Estimation

GHand - A Graph Convolution Network for 3D Hand Pose Estimation.

Hand Pose Estimation with Attention-and-Sequence Network.

Joint Hand Detection and Rotation Estimation by Using CNN

Learning a Deep Predictive Coding Network for a Semi-Supervised 3D-Hand Pose Estimation

Improved Modular Convolution Neural Network for Human Pose Estimation