A Deconvolutional Bottom-up Deep Network for Multi-Person Pose Estimation.

Meng Li, Haoqian Wang, Yongbing Zhang, Yi Yang
DOI: https://doi.org/10.1109/ist48021.2019.9010189
2019-01-01
Abstract: Due to the trade-off between model complexity and estimation accuracy, current human pose estimators either have low accuracy or require long running times. This dilemma is especially severe in real-time multi-person pose estimation. To address this issue, we design a deep network with a reduced parameter count and high estimation accuracy by using deconvolution layers in place of the widely used fully connected configuration. Specifically, our model consists of two successive parts: a detection network and a matching network. The former outputs keypoint heatmaps and person locations, and the latter then produces the final pose estimate using multiple deconvolutional layers. Benefiting from its simple structure and its explicit use of the previously neglected spatial structure in the heatmaps, the matching network is both highly efficient and highly precise. Experiments on the challenging COCO dataset demonstrate that our method can drastically reduce the number of parameters in the matching network while achieving higher accuracy than existing methods.
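To make the parameter argument concrete, the following is a minimal sketch comparing the parameter count of a fully connected layer with that of a small stack of transposed-convolution (deconvolution) layers. It is not the authors' architecture: the heatmap resolution, channel width, kernel size, and the idea of a fully connected baseline that maps flattened heatmaps to a same-sized output are all assumptions chosen for illustration. The helper names fc_params and deconv_params are hypothetical. Only the figure of 17 keypoints per person comes from the COCO annotation format.

```python
def fc_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


def deconv_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a 2D transposed-convolution layer."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


# Hypothetical sizes, chosen only for illustration (not taken from the paper).
K = 17          # COCO annotates 17 keypoints per person
H, W = 64, 48   # assumed heatmap resolution
C = 64          # assumed hidden channel width

# Assumed fully connected baseline: flatten the K heatmaps and map them
# to an output of the same size.
fc = fc_params(K * H * W, K * H * W)

# Assumed deconvolutional alternative: K -> C -> K channels with 4x4 kernels.
# The kernel weights are shared across all spatial positions.
deconv = deconv_params(K, C, 4) + deconv_params(C, K, 4)

print(f"fully connected: {fc:,} parameters")
print(f"deconvolutional: {deconv:,} parameters")
```

Because a transposed convolution shares its kernel across spatial positions, its parameter count depends only on the channel widths and kernel size, not on the heatmap resolution, whereas the fully connected count grows with the square of the flattened heatmap size. The actual savings reported in the paper depend on its real layer configuration, which the abstract does not specify.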