Adaptive Positive Sample Selection and Dynamic Soft Label Assignment for Keypoint Detection

Wenxiao Tang, Shiqi Chen, Minghui Wang, M. Saad Shakeel, Jian Jin, Wenxiong Kang, Weisi Lin
DOI: https://doi.org/10.1109/tcsvt.2024.3434563
IF: 5.859
2024-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Pose estimation plays a crucial role in human-centered vision applications. Some recent efforts achieve pose estimation through keypoint detection. Drawing inspiration from object detection, they treat keypoints as objects and achieve unbiased estimation by implementing classification and regression heads. However, they still fail to achieve satisfactory performance on heavily occluded keypoints and require elaborate, unavoidable post-processing steps. After a thorough exploration of keypoints’ characteristics, we have developed a novel Adaptive positive Sample selection and dynamic soft Label Assignment (ASLA) scheme tailored for keypoint detection. Specifically, during training we select positive samples for each keypoint according to the sum of two distances: from each sample's coordinates to the corresponding ground truth (GT), and from the sample's predicted coordinates to that GT. For occluded keypoints, the positive samples defined by our method may fall in the semantically relevant regions of pedestrians rather than the spatially adjacent regions of obstructions, as illustrated in Fig. 1, which significantly improves their localization performance. Meanwhile, we dynamically assign classification labels to these positive samples based on the distance between their predicted coordinates and the corresponding GT, which ensures that high-quality positive samples receive high classification labels. Owing to the practical design of ASLA, the post-processing step is not essential, although simple vector-level post-processing can still provide a further improvement. Finally, we extensively evaluate ASLA on two popular human pose estimation benchmarks, COCO and MPII, and comprehensive experiments show that ASLA significantly outperforms state-of-the-art algorithms. Our code and models will be available at https://github.com/SCUT-BIPLab/ASLA.
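The selection and labeling rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the top-k cutoff `k`, the exponential decay, and the scale `sigma` are all assumptions introduced here for illustration. The abstract only states that positives are chosen by the summed distance and that labels depend on the prediction-to-GT distance.

```python
import math

def select_positives_and_labels(samples, preds, gt, k=2, sigma=2.0):
    """Hypothetical sketch of the idea in the abstract, not the paper's code.

    samples: (x, y) location of each candidate sample
    preds:   (x, y) keypoint coordinate predicted from each sample
    gt:      (x, y) ground-truth keypoint
    k, sigma: assumed hyperparameters (not given in the abstract)
    """
    costs = []
    for i, (s, p) in enumerate(zip(samples, preds)):
        d_sample = math.dist(s, gt)  # sample location to GT
        d_pred = math.dist(p, gt)    # predicted coordinate to GT
        costs.append((d_sample + d_pred, i, d_pred))
    costs.sort()                     # smallest summed distance first
    chosen = costs[:k]               # assumed rule: keep the k best samples

    labels = [0.0] * len(samples)    # negatives keep label 0
    for _, i, d_pred in chosen:
        # Assumed soft label: decays with prediction error, 1.0 when exact.
        labels[i] = math.exp(-d_pred / sigma)
    return sorted(i for _, i, _ in chosen), labels

gt = (5.0, 5.0)
samples = [(5.0, 6.0), (0.0, 5.0), (5.0, 5.0), (9.0, 9.0)]
preds = [(12.0, 12.0), (5.0, 5.0), (6.0, 5.0), (9.0, 9.0)]
positives, labels = select_positives_and_labels(samples, preds, gt)
print(positives, labels)
```

In this toy input, sample 1 lies five pixels from the GT but predicts it exactly, so it is selected ahead of sample 0, which is adjacent to the GT but predicts poorly. This mirrors the occlusion case in the abstract, where a semantically relevant region can be preferred over a spatially adjacent one.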