Deep Conditional Variational Estimation for Depth-Based Hand Poses

Lu Xu, Chen Hu, Yinqi Li, Ji'an Tao, Jianru Xue, Kuizhi Mei
DOI: https://doi.org/10.1109/fg.2019.8756559
2019-01-01
Abstract: We propose a novel and effective approach to 3D hand pose estimation from a single depth image. Instead of regressing joint positions deterministically from depth images, our model learns a latent distribution over the high-dimensional space of pose joints, which can also be interpreted as a kinematic model of the human hand. Specifically, the proposed network combines a conditional variational autoencoder, which learns an encoder and a decoder, with a standard convolutional network. The encoder models the latent variable as a prior, or regularizer, for the pose joints. The decoder then performs probabilistic inference to generate the output prediction conditioned on the input depth image. In addition, we introduce a pool-convolution module to improve the localization regression of the network. The architecture can be trained end-to-end. In experiments, we demonstrate the effectiveness of the proposed approach in comparison with various state-of-the-art holistic regression approaches.
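For readers unfamiliar with conditional variational autoencoders, the sketch below shows the two generic ingredients that such a model typically relies on: the reparameterized sampling of the latent variable and a loss that adds a KL regularizer to a reconstruction term. It is a minimal illustration under stated assumptions, not the paper's network or loss: the function names, the squared-error reconstruction term, the standard normal prior, the `beta` weight, and the 63-dimensional pose vector in the usage note are all hypothetical choices made for this example.

```python
import numpy as np


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """Per-sample KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def cvae_loss(pose, pose_pred, mu, log_var, beta=1.0):
    """Batch-averaged squared-error reconstruction plus beta-weighted KL term.

    Assumption for illustration: pose_pred is the decoder output for a latent
    sample combined with depth-image features; neither network is shown here.
    """
    recon = np.sum((pose - pose_pred) ** 2, axis=-1)
    return float(np.mean(recon + beta * kl_to_standard_normal(mu, log_var)))
```

As a usage illustration, a batch of two poses with 21 joints in 3D would give `pose` of shape `(2, 63)`, and `reparameterize(mu, log_var, np.random.default_rng(0))` returns a latent sample with the same shape as `mu`.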