Towards Scalable Scenarios Human Pose Estimation Via Two-Stage Hierarchical Network

QiKun Yang, Ming Liu, Lingqin Kong, Yuejin Zhao, Liquan Dong, Mei Hui, Zhongyi Fan
DOI: https://doi.org/10.1117/12.2643759
2022-01-01
Abstract: Human pose estimation is a key step in understanding human behavior in images and videos. Bottom-up human pose estimation methods struggle to predict the correct pose of a person in large scenes because of scale variation. In this paper, we propose a two-stage hierarchical network. The system first acquires images of the large scene and, based on the motion-target detection box, sends tracking commands to a two-degree-of-freedom shooting platform equipped with an image sensor so that the platform follows the moving target. A top-down target detection algorithm then locally constrains the captured image stream, retaining only the image content related to the moving target. Finally, the processed images are fed into a general-purpose human pose estimation model for pose detection. We deployed the algorithm on a two-degree-of-freedom filming platform equipped with a camera and ran detection experiments on athletes in running and ski jumping scenes. Using the athlete and the nearby area as the region of interest (ROI), the system generates images or videos overlaid with the athlete's skeleton pose to guide the athlete's training. This approach mitigates, to some extent, the scale-variation problem in bottom-up multi-person pose estimation, and in large scenes in particular it locates person keypoints more accurately. The experiments show that the approach meets the practical speed and accuracy requirements of athlete pose detection in large scenes of everyday sports.
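The abstract does not give implementation details, so the following is only a minimal sketch of the two geometric steps it describes: turning a detection box into a tracking error for the two-degree-of-freedom platform, and constraining the frame to an ROI around the target before pose estimation. The function names, the `(x_min, y_min, x_max, y_max)` box format, and the `margin` parameter are all hypothetical choices made for illustration, not the authors' method.

```python
from typing import List, Sequence, Tuple

# Assumed box format (not specified in the abstract): pixel corners.
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def tracking_error(box: Box, frame_w: int, frame_h: int) -> Tuple[float, float]:
    """Offset of the box center from the frame center, normalized to [-1, 1].

    A pan/tilt controller could use these two values as its error signal;
    the control law itself is outside the scope of this sketch.
    """
    x0, y0, x1, y1 = box
    cx = (x0 + x1) / 2.0
    cy = (y0 + y1) / 2.0
    half_w = frame_w / 2.0
    half_h = frame_h / 2.0
    return ((cx - half_w) / half_w, (cy - half_h) / half_h)


def expand_roi(box: Box, frame_w: int, frame_h: int, margin: float = 0.25) -> Box:
    """Grow the box by `margin` of its size on each side, clamped to the frame.

    The margin keeps the "nearby area" around the target inside the ROI;
    the value 0.25 is an arbitrary illustrative default.
    """
    x0, y0, x1, y1 = box
    pad_x = int((x1 - x0) * margin)
    pad_y = int((y1 - y0) * margin)
    return (
        max(0, x0 - pad_x),
        max(0, y0 - pad_y),
        min(frame_w, x1 + pad_x),
        min(frame_h, y1 + pad_y),
    )


def crop(frame: Sequence[Sequence[int]], roi: Box) -> List[Sequence[int]]:
    """Crop a row-major frame (a sequence of pixel rows) to the ROI."""
    x0, y0, x1, y1 = roi
    return [row[x0:x1] for row in frame[y0:y1]]
```

In such a pipeline, the cropped ROI rather than the full frame would be passed to the pose estimation model, so the person occupies a larger share of the input and the scale variation seen by the model is reduced.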