Aggregated Pyramid Gating Network for Human Pose Estimation Without Pre-Training

Chenru Jiang, Kaizhu Huang, Shufei Zhang, Xinheng Wang, Jimin Xiao, Yannis Goulermas
DOI: https://doi.org/10.1016/j.patcog.2023.109429
IF: 8
2023-01-01
Pattern Recognition
Abstract: In this work, we propose a comprehensive aggregated residual gating structure, the Pyramid GAting Network (PGA-Net), for human pose estimation, which can select, distill, and fuse semantic-level and natural-level information from multiple scales. By comparison, despite utilizing multi-scale features, most existing state-of-the-art pose estimation methods are still limited in three respects. First, multi-scale features contain massively redundant information, which most existing approaches unfortunately do not distill. Second, because they prefer deeper network structures to extract strong semantic features, conventional methods often ignore the fusion of original texture information. Third, to attain a good parameter initialization, current methods rely heavily on pre-training, which is very time-consuming or even unavailable. To better cope with these problems, our proposed PGA-Net distills high-level semantic features and replenishes low-level original information to reinforce module representation capability. Meanwhile, PGA-Net demonstrates notable training stability and superior performance even without pre-training. Extensive experiments demonstrate that our method consistently outperforms previous approaches even without pre-training, thus enabling end-to-end model training from scratch. On the COCO benchmark, PGA-Net consistently improves on the baseline (without pre-training) by over 3% under various model configurations. (c) 2023 Elsevier Ltd. All rights reserved.