Human Pose Estimation Based on Step Deep Convolution Neural Network.

Wenxia Bao, Yaping Yang, Dong Liang, Ming Zhu
DOI: https://doi.org/10.1109/CISP-BMEI.2018.8633110
2018-01-01
Abstract: To improve the accuracy and robustness of human pose estimation, a step deep convolution neural network is proposed, consisting of a feed-forward module and several step modules. Image features output by the last four layers of the feed-forward module are fused with context features, and the fused features serve as the input to the first step module. In each step module, the image features are likewise fused with context features and passed as input to the next step module. The confidence map output by the last step module is used to predict the joint positions. This stepwise approach enlarges the receptive field, which helps the network learn long-range relationships between joints and predict the positions of occluded joints. At the same time, the confidence maps computed by each module give the subsequent modules progressively more accurate estimates of each joint's position. In addition, the network strengthens intermediate supervision through its learning objective, which supplements the back-propagated gradient and adjusts the learning process, effectively addressing the vanishing-gradient problem during training. Our approach is tested on two standard datasets, Leeds Sports Poses (LSP) and Frames Labeled In Cinema (FLIC), and the results indicate that our network achieves better performance in human pose estimation.
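As a rough illustration of two ideas in the abstract (reading a joint position off a confidence map, and summing a loss over every step module as intermediate supervision), a minimal Python sketch might look like the following. The abstract does not give the loss function or the decoding rule, so the squared-error loss, the argmax decoding, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

# One 2-D confidence map (rows x cols) for a single joint.
ConfidenceMap = List[List[float]]


def predict_joint(conf_map: ConfidenceMap) -> Tuple[int, int]:
    """Return (row, col) of the highest-confidence cell (assumed argmax decoding)."""
    best_rc = (0, 0)
    best_val = conf_map[0][0]
    for r, row in enumerate(conf_map):
        for c, val in enumerate(row):
            if val > best_val:
                best_val = val
                best_rc = (r, c)
    return best_rc


def map_l2_loss(pred: ConfidenceMap, target: ConfidenceMap) -> float:
    """Sum of squared differences between a predicted and a target map (assumed loss)."""
    return sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )


def intermediate_supervision_loss(
    step_outputs: List[List[ConfidenceMap]],
    target_maps: List[ConfidenceMap],
) -> float:
    """Add up the loss of every step module, not just the last one.

    step_outputs[s][j] is the map predicted by step module s for joint j;
    target_maps[j] is the ground-truth map for joint j. Because each step
    contributes its own term, each step receives a direct gradient signal.
    """
    return sum(
        map_l2_loss(step_maps[j], target_maps[j])
        for step_maps in step_outputs
        for j in range(len(target_maps))
    )
```

In this sketch only the last step module's map would be passed to `predict_joint` at inference time, while `intermediate_supervision_loss` would be used during training over the outputs of all step modules.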