An enhanced real-time human pose estimation method based on modified YOLOv8 framework

Chengang Dong,Guodong Du

DOI: https://doi.org/10.1038/s41598-024-58146-z

IF: 4.6

2024-04-07

Scientific Reports

Abstract: Deep learning-based human pose estimation (HPE) aims to accurately estimate and predict human body posture in images or videos using deep neural networks. However, the accuracy of real-time HPE still needs improvement because of factors such as partial occlusion of body parts and the limited receptive field of the model. To alleviate the accuracy loss caused by these issues, this paper proposes a real-time HPE model called CCAM-Person, based on the YOLOv8 framework. First, we improve the backbone and neck of the YOLOv8x-pose real-time HPE model to alleviate feature loss and receptive field constraints. Second, we introduce the context coordinate attention module (CCAM) to augment the model's focus on salient features, reduce background noise interference, alleviate keypoint regression failure caused by limb occlusion, and improve the accuracy of pose estimation. Our approach attains competitive results on multiple metrics on two open-source datasets, MS COCO 2017 and CrowdPose. Compared with the baseline model YOLOv8x-pose, CCAM-Person improves the average precision by 2.8% and 3.5% on the two datasets, respectively.

multidisciplinary sciences

What problem does this paper attempt to address?

The paper addresses two key issues in real-time Human Pose Estimation (HPE):

1. **Inaccurate keypoint localization due to limited receptive field**: Existing real-time HPE methods localize keypoints inaccurately because of a limited receptive field or the loss of original features.
2. **Pose estimation failure due to occlusion**: When human body parts are occluded, existing methods struggle to estimate the pose accurately.

To solve these problems, the paper proposes CCAM-Person, an improved model based on the YOLOv8 framework. The model is optimized in three ways:

1. **Multi-Scale Receptive Field Module (MRF)**: Introduced in the backbone to aggregate more low-level features, improving the accuracy of human pose estimation at different scales.
2. **Multi-Path Feature Pyramid Network (MFPN)**: Replaces the original PANet structure to achieve more efficient cross-layer feature fusion and reduce information loss.
3. **Context Coordinate Attention Module (CCAM)**: Strengthens the focus on salient features, reduces background noise interference, and alleviates keypoint regression failure caused by limb occlusion, thereby improving pose estimation accuracy.

With these improvements, CCAM-Person outperforms the baseline model YOLOv8x-pose on two open-source datasets (MS COCO 2017 and CrowdPose), improving average precision by 2.8% and 3.5%, respectively.
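The summary does not give CCAM's internals. Its name suggests a relation to coordinate attention, so the sketch below illustrates only that generic idea and is not the paper's module. It assumes a single-channel feature map stored as nested lists and replaces the learned 1x1 convolutions of real coordinate attention with an identity mapping. The function name `coordinate_gate` is hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def coordinate_gate(feature_map):
    """Reweight an H x W feature map with row-wise and column-wise gates.

    Assumption: learned projections are omitted, so each gate is simply
    the sigmoid of the pooled descriptor. This is an illustration of
    direction-aware gating, not the CCAM described in the paper.
    """
    height = len(feature_map)
    width = len(feature_map[0])

    # Average over the width: one descriptor per row (vertical position).
    row_desc = [sum(row) / width for row in feature_map]
    # Average over the height: one descriptor per column (horizontal position).
    col_desc = [
        sum(feature_map[i][j] for i in range(height)) / height
        for j in range(width)
    ]

    row_gate = [sigmoid(v) for v in row_desc]
    col_gate = [sigmoid(v) for v in col_desc]

    # Each position is scaled by the gate of its row and the gate of its column.
    return [
        [feature_map[i][j] * row_gate[i] * col_gate[j] for j in range(width)]
        for i in range(height)
    ]


# A weak uniform background with one strong response at row 1, column 2.
fmap = [
    [0.1, 0.1, 0.1],
    [0.1, 0.1, 4.0],
    [0.1, 0.1, 0.1],
]
gated = coordinate_gate(fmap)
print(gated[1][2] / gated[0][0], fmap[1][2] / fmap[0][0])
```

In this toy input, the row and column containing the strong response receive larger gates than the rest, so the ratio between the peak and the background grows after gating. That is the sense in which position-aware attention can suppress background responses.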

Context-Guided Adaptive Network for Efficient Human Pose Estimation

KSL-POSE: A Real-Time 2D Human Pose Estimation Method Based on Modified YOLOv8-Pose Framework

Research on Human Posture Estimation Algorithm Based on YOLO-Pose

MDA-YOLO Person: a 2D human pose estimation model based on YOLO detection framework

Unified End-to-End YOLOv5-HR-TCM Framework for Automatic 2D/3D Human Pose Estimation for Real-Time Applications

Classroom Student Posture Recognition Based on an Improved High-Resolution Network

YOLO-Rlepose: Improved YOLO Based on Swin Transformer and Rle-Oks Loss for Multi-Person Pose Estimation

Human Pose Estimation Based on Lightweight Multi-Scale Coordinate Attention

A study of human pose estimation in low-light environments using YOLOv8 model

3D Human Pose Estimation Based on Wearable IMUs and Multiple Camera Views

Optimized S2E Attention Block based Convolutional Network for Human Pose Estimation

Deep Dual Consecutive Network for Human Pose Estimation

Object Pose Estimation Based on Improved YOLOX Algorithm

HCA-YOLO: a non-salient object detection method based on hierarchical attention mechanism

Single-Stage Pose Estimation and Joint Angle Extraction Method for Moving Human Body

Hope: heatmap and offset for pose estimation

Improved YOLOv8 Model for a Comprehensive Approach to Object Detection and Distance Estimation

A Compact and Powerful Single-Stage Network for Multi-Person Pose Estimation

Efficient Monocular Human Pose Estimation Based on Deep Learning Methods: A Survey

Parallel Self-Attention and Spatial-Attention Fusion for Human Pose Estimation and Running Movement Recognition