Abstract: Human pose estimation has numerous applications in motion recognition, virtual reality, human–computer interaction, and related fields. However, multi-person pose estimation in crowded and occluded scenes remains challenging. A major limitation of current top-down approaches is that they predict the pose of only a single person per bounding box, even when the box contains multiple individuals. To address this problem, we propose a novel Crowd and Occlusion-aware Network (CONet) based on a divide-and-conquer strategy. Our approach includes a Crowd and Occlusion-aware Head (COHead) that estimates the poses of both the occluder and the occluded person using two separate branches. We also use an attention mechanism to guide the two branches toward differentiated learning, with the aim of improving feature representation. Additionally, we propose a novel interference point loss to enhance the model's anti-interference ability. CONet is simple yet effective: it outperforms the previous state-of-the-art model by 1.6 AP, achieving 71.6 AP on the CrowdPose dataset. These results demonstrate its effectiveness in crowded and occluded scenes and highlight its potential in real-world applications where accurate human pose estimation is crucial, such as surveillance, sports analysis, and human–computer interaction.
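
To make the two-branch idea concrete, the following is a minimal sketch of a head that produces one set of keypoint heatmaps for the occluder and one for the occluded person from a shared feature map. It is not the COHead described in the abstract: the class name `TwoBranchHead`, the squeeze-and-gate style of channel attention, the 1x1 projection, the random untrained weights, and all tensor sizes are assumptions chosen for illustration only. The interference point loss is not sketched because the abstract gives no definition of it.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TwoBranchHead:
    """Hypothetical two-branch head (illustrative only, not the paper's COHead).

    Branch 0 is meant for the occluder, branch 1 for the occluded person.
    Each branch has its own channel-attention gate and its own 1x1 projection.
    """

    def __init__(self, in_channels, num_keypoints, seed=0):
        rng = np.random.default_rng(seed)
        # Assumed parameters: random, untrained weights for both branches.
        self.attn_w = rng.normal(0.0, 0.1, size=(2, in_channels, in_channels))
        self.proj_w = rng.normal(0.0, 0.1, size=(2, num_keypoints, in_channels))

    def forward(self, feats):
        # feats: shared backbone features of shape (C, H, W).
        pooled = feats.mean(axis=(1, 2))  # global average pool -> (C,)
        heatmaps = []
        for b in range(2):
            # Branch-specific channel gate in (0, 1), so each branch can
            # emphasise different channels of the shared features.
            gate = sigmoid(self.attn_w[b] @ pooled)  # (C,)
            gated = feats * gate[:, None, None]  # (C, H, W)
            # 1x1 convolution written as a contraction over channels.
            heatmaps.append(np.einsum("kc,chw->khw", self.proj_w[b], gated))
        return heatmaps[0], heatmaps[1]


# Usage with assumed sizes: 32 feature channels, a 64x48 map, 14 keypoints.
head = TwoBranchHead(in_channels=32, num_keypoints=14)
feats = np.random.default_rng(1).normal(size=(32, 64, 48))
occluder_hm, occluded_hm = head.forward(feats)
```

Because each branch owns its gate and projection, the two outputs differ even though they read the same features, which is the property a divide-and-conquer head relies on.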

CONet: Crowd and occlusion-aware network for occluded human pose estimation
