Multi-Human Pose Estimation by Deep Learning-Based Sequential Approach for Human Keypoint Position and Human Body Detection

Rizwan Tahir, Yunze Cai
DOI: https://doi.org/10.1007/s12204-023-2658-z
2023-01-01
Journal of Shanghai Jiaotong University (Science)
Abstract: Recent multimedia and computer vision research has focused on analyzing human behavior and activity from images. Skeleton estimation, also known as pose estimation, has received significant attention. Deep learning approaches to human pose estimation primarily emphasize keypoint features. However, for occluded or incomplete poses, keypoint features alone are insufficient, especially when a single frame contains multiple humans. Other features, such as the body border and visibility conditions, can contribute to pose estimation in addition to the keypoint features. Our model framework integrates multiple features through a mask region-based convolutional neural network (Mask-RCNN), namely the human body mask features, which can serve as a constraint on keypoint location estimation, the body keypoint features, and the keypoint visibility. A sequential multi-feature learning setup is formed to share multiple features across the structure, whereas in Mask-RCNN the only feature that can be shared through the system is the region-of-interest feature. By using two-way up-scaling with a shared-weight process to produce the mask, we address the problems of improper segmentation, small intrusion, and object loss that arise when Mask-RCNN is used for instance segmentation. Accuracy is measured by the percentage of correct keypoints (PCK), and our model identifies 86.1% of the keypoints correctly.
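The abstract reports accuracy as the percentage of correct keypoints but does not say how a keypoint is judged correct. The sketch below shows one common way to compute such a score; it is an illustration only, not the authors' evaluation code. It assumes a keypoint counts as correct when its Euclidean distance to the ground truth is at most `alpha` times a per-person reference length. The function name `pck`, the parameter names, the `alpha=0.2` default, and the choice of reference length are all hypothetical.

```python
import numpy as np


def pck(pred, gt, visible, ref_len, alpha=0.2):
    """Fraction of annotated keypoints predicted within a distance threshold.

    pred, gt : (N, K, 2) arrays of predicted / ground-truth (x, y) positions
    visible  : (N, K) boolean mask, True where a keypoint is annotated
    ref_len  : (N,) per-person reference length (assumed normalization,
               e.g. a bounding-box or torso size)
    alpha    : fraction of ref_len used as the distance threshold (assumed)
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    visible = np.asarray(visible, dtype=bool)
    ref_len = np.asarray(ref_len, dtype=float)

    # Euclidean distance between prediction and ground truth, shape (N, K)
    dist = np.linalg.norm(pred - gt, axis=-1)
    # A keypoint is correct if it lies within alpha * reference length
    correct = dist <= alpha * ref_len[:, None]

    n_annotated = visible.sum()
    if n_annotated == 0:
        return float("nan")  # nothing to evaluate
    # Only annotated keypoints contribute to the score
    return float((correct & visible).sum() / n_annotated)


# One person, three keypoints, reference length 10 -> threshold of 2.0 pixels.
gt = [[[0, 0], [10, 0], [0, 10]]]
pred = [[[1, 0], [10, 5], [0, 10]]]
visible = [[True, True, True]]
score = pck(pred, gt, visible, ref_len=[10.0])
# Distances are 1, 5 and 0, so 2 of the 3 keypoints count as correct.
```

The visibility mask matters in multi-person scenes with occlusion: keypoints without annotations are excluded from both the numerator and the denominator, so they neither reward nor penalize the model.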