DCAPose: Improve One-Stage Multi-Person Pose Estimation with Dynamic Center Assignment

Wei Zhang, Huiru Xie, Qi Li, Zhen Sun
DOI: https://doi.org/10.1109/FG59268.2024.10582017
2024-05-27
Abstract: Single-stage methods for multi-person pose estimation have gained significant attention for their ability to localize persons and perceive body structure in a single processing step. However, existing single-stage methods often rely on hand-crafted centers to represent the position of human instances. This simplification tends to overlook the intricate structure of the human body, causing a misalignment between the designated centers and the actual instance centers and, consequently, suboptimal performance. In this paper, we introduce DCAPose, a straightforward yet powerful pipeline that addresses this issue by recasting center selection as a set prediction problem. Rather than directly supervising the center positions of instances, our approach treats each position on the center map as a potential instance candidate. We use a skeleton-aware bipartite matching loss to enforce one-to-one matching between the poses of the candidate set and the ground truths. Additionally, we introduce a novel bidirectional hierarchical body representation to capture human structural information more accurately. Our method eliminates the need for Non-Maximum Suppression, greatly simplifying the processing pipeline and enabling end-to-end optimization. Extensive testing on the challenging COCO and CrowdPose benchmarks confirms that DCAPose surpasses other leading single-stage methods, demonstrating the effectiveness of our framework.
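The one-to-one matching between candidate poses and ground truths can be illustrated with a minimal sketch. It is not the paper's skeleton-aware loss: the cost used here (mean Euclidean keypoint distance) is an assumption, the names `pose_cost` and `match_candidates` are hypothetical, and the exhaustive search stands in for the Hungarian algorithm that set-prediction methods typically use, so it only suits tiny inputs.

```python
from itertools import permutations
from math import dist


def pose_cost(candidate, target):
    """Mean Euclidean distance between corresponding keypoints.

    Assumed stand-in cost; the paper's skeleton-aware cost is not specified here.
    """
    return sum(dist(p, q) for p, q in zip(candidate, target)) / len(target)


def match_candidates(candidates, targets):
    """Find the cheapest one-to-one assignment of candidates to targets.

    Returns (total_cost, assignment), where assignment[j] is the index of
    the candidate matched to ground-truth pose j. Unmatched candidates would
    be treated as background. Exhaustive search, for illustration only.
    """
    if len(candidates) < len(targets):
        raise ValueError("need at least as many candidates as ground-truth poses")
    best_cost, best_assignment = float("inf"), ()
    # Try every way of picking a distinct candidate for each target.
    for chosen in permutations(range(len(candidates)), len(targets)):
        total = sum(pose_cost(candidates[i], targets[j]) for j, i in enumerate(chosen))
        if total < best_cost:
            best_cost, best_assignment = total, chosen
    return best_cost, best_assignment


# Toy data: four candidate poses (two keypoints each) and two ground truths.
candidates = [
    [(10.0, 10.0), (10.0, 20.0)],
    [(50.0, 50.0), (50.0, 60.0)],
    [(11.0, 10.0), (11.0, 20.0)],
    [(90.0, 90.0), (90.0, 99.0)],
]
targets = [
    [(50.0, 51.0), (50.0, 61.0)],
    [(10.0, 10.0), (10.0, 20.0)],
]

cost, assignment = match_candidates(candidates, targets)
print(assignment, cost)
```

Because each ground truth claims exactly one candidate, near-duplicate candidates (such as the first and third above) cannot both be matched, which is the property that lets a set-prediction formulation dispense with Non-Maximum Suppression.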
Computer Science