Efficient High-Resolution High-Level-Semantic Representation Learning for Human Pose Estimation

Hong Liu, Lisi Guan
DOI: https://doi.org/10.1109/icpr48806.2021.9412886
2021-01-01
Abstract: Features that carry both high-resolution and high-level-semantic information are important for human pose estimation, yet most existing methods suffer from spatial information loss or semantic information mismatch when extracting such features. To address these issues efficiently, we propose a novel Dilation Pyramid Module (DPM), which enlarges the receptive field multiplicatively, as subsampling does, to extract high-level-semantic information without reducing spatial resolution. DPM is composed of several consecutive dilated convolution layers whose dilation radii are specially designed to enlarge the receptive field multiplicatively and to avoid the gridding issue of dilated convolution. Based on DPM, we propose the Dilation Pyramid Net (DPN) to efficiently extract high-resolution high-level-semantic features. Experiments on two challenging benchmark datasets, the COCO keypoint detection dataset and the MPII Human Pose dataset, demonstrate the effectiveness and efficiency of DPN, which achieves performance competitive with state-of-the-art methods.
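The receptive-field claim can be illustrated with a small one-dimensional calculation. The sketch below is not the authors' DPM: the abstract does not give the dilation values, so the kernel size of 3 and the dilations 1, 3, 9 are assumptions chosen only to show one schedule under which the receptive field triples per layer while every input position still contributes. The helper names `receptive_field`, `contributing_offsets`, and `has_gridding` are hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (one dimension) of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        # Each layer widens the field by (kernel_size - 1) * dilation.
        rf += (kernel_size - 1) * d
    return rf


def contributing_offsets(kernel_size, dilations):
    """Input offsets that can influence a single output position.

    Assumes an odd kernel size so the taps are symmetric around zero.
    """
    half = kernel_size // 2
    taps = range(-half, half + 1)
    offsets = {0}
    for d in dilations:
        offsets = {o + t * d for o in offsets for t in taps}
    return offsets


def has_gridding(kernel_size, dilations):
    """True if some positions inside the receptive field are never sampled."""
    covered = len(contributing_offsets(kernel_size, dilations))
    return covered < receptive_field(kernel_size, dilations)


# Assumed schedule: dilations 1, 3, 9 with kernel size 3.
print(receptive_field(3, [1, 3, 9]))  # 27
print(has_gridding(3, [1, 3, 9]))     # False

# A constant dilation of 2 leaves holes in the receptive field.
print(receptive_field(3, [2, 2, 2]))  # 13
print(has_gridding(3, [2, 2, 2]))     # True
```

Under these assumptions the receptive field grows as 3, 9, 27 across the three layers, comparable to three stride-3 subsampling steps, while the feature map keeps its original resolution because every layer has stride 1.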