CWF-HRNet: Context-aware Fusion for Human Pose Estimation Based on High-Resolution Networks
Shiyu Ding, Jin Li, Kuan Luan, Hong Liang, Jiqing Xing
DOI: https://doi.org/10.1109/icma61710.2024.10632900
2024-01-01
Abstract: This study proposes CWF-HRNet, a network based on dynamic perception fusion, to improve human pose estimation for visual aids in volleyball coaching. It addresses key challenges in real-world teaching scenarios: blurred image details, inadequate multi-resolution information fusion, and limb occlusion. CWF-HRNet integrates the CARAFE upsampling module, which dynamically adjusts each pixel's contribution using context-dependent weight matrices derived from the image content. To improve multi-resolution information fusion, the HL-WASP multi-scale fusion module combines shallow features from different layers, expanding the receptive field and improving detail capture. In addition, a context-based dynamic perception module strengthens global semantic understanding to handle occlusions. Experimental results show that, compared with the HRNet model, CWF-HRNet improves AP, AP50, AP75, APM, APL, and AR by 4.1%, 2.6%, 3.5%, 3.6%, 4.4%, and 3.4%, respectively, while reducing the parameter count by 14%. These gains in accuracy and efficiency under challenging conditions make it well suited to real-time sports training applications.
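To make the content-aware upsampling idea in the abstract concrete, the following is a minimal NumPy sketch of CARAFE-style reassembly. It is an illustration under stated assumptions, not the authors' implementation: the function name `carafe_style_upsample`, its parameters, and the defaults `scale=2` and `k=5` are hypothetical, and the per-pixel kernel logits are passed in directly rather than predicted from the feature map by a learned encoder, as they would be in a trained network.

```python
import numpy as np


def carafe_style_upsample(x, kernel_logits, scale=2, k=5):
    """Upsample x with a separate, content-dependent kernel per output pixel.

    x:             feature map of shape (C, H, W)
    kernel_logits: unnormalized weights of shape (k*k, scale*H, scale*W);
                   assumed given here (a real model would predict them from x)
    """
    c, h, w = x.shape
    assert kernel_logits.shape == (k * k, scale * h, scale * w)

    # Softmax over the k*k axis so each output pixel's weights sum to 1.
    z = kernel_logits - kernel_logits.max(axis=0, keepdims=True)
    weights = np.exp(z)
    weights /= weights.sum(axis=0, keepdims=True)

    r = k // 2
    padded = np.pad(x, ((0, 0), (r, r), (r, r)))  # zero padding at the borders
    out = np.zeros((c, scale * h, scale * w), dtype=float)

    for n in range(k * k):
        dy, dx = divmod(n, k)
        # Neighbor at offset (dy - r, dx - r) of every source pixel.
        shifted = padded[:, dy:dy + h, dx:dx + w]
        # Map each output pixel to its source location (i // scale, j // scale).
        expanded = shifted.repeat(scale, axis=1).repeat(scale, axis=2)
        # Weight this neighbor by the kernel entry predicted for each output pixel.
        out += weights[n] * expanded
    return out
```

In this sketch, uniform logits reduce the operation to local averaging, while logits concentrated on the center entry reduce it to nearest-neighbor upsampling; content-dependent logits interpolate between such behaviors pixel by pixel.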