A lightweight attention-driven distillation model for human pose estimation

Falai Wei, Xiaofang Hu
DOI: https://doi.org/10.1016/j.patrec.2024.08.009
IF: 4.757
2024-08-26
Pattern Recognition Letters
Abstract: Currently, research on human pose estimation primarily focuses on heatmap-based and regression-based methods. However, the increasing complexity of heatmap models and the low accuracy of regression methods are becoming significant barriers to the advancement of the field. In recent years, researchers have begun exploring new methods to transfer knowledge from heatmap models to regression models. Recognizing the limitations of existing approaches, our study introduces a novel distillation model that is both lightweight and precise. In the feature extraction phase, we design the Channel-Attention-Unit (CAU), which integrates group convolution with an attention mechanism to effectively reduce redundancy while maintaining model accuracy with a decreased parameter count. During distillation, we develop the attention loss function, L_A, which enhances the model's capacity to locate key points quickly and accurately, emulating the effect of additional transformer layers and boosting precision without the need for increased parameters or network depth. Specifically, on the CrowdPose test dataset, our model achieves 71.7% mAP with 4.3M parameters, 2.2 GFLOPs, and 51.3 FPS. Experimental results demonstrate the model's strong capabilities in both accuracy and efficiency, making it a viable option for real-time pose estimation tasks in real-world environments.
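As a rough illustration of why group convolution lowers the parameter count, the sketch below counts the weights of a standard and a grouped 3x3 convolution. The channel width (64), group count (4), and helper name `conv_params` are assumptions chosen for illustration; the abstract does not state the CAU's actual configuration, and this is not the authors' implementation.

```python
def conv_params(c_in, c_out, kernel, groups=1, bias=False):
    """Number of learnable parameters in a 2D convolution with a square kernel.

    With `groups` > 1, each output channel only sees c_in / groups input
    channels, so the weight count shrinks by a factor of `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    weights = (c_in // groups) * c_out * kernel * kernel
    return weights + (c_out if bias else 0)


# Hypothetical layer sizes, not taken from the paper.
standard = conv_params(64, 64, 3)            # dense 3x3 convolution
grouped = conv_params(64, 64, 3, groups=4)   # same layer split into 4 groups

print(f"standard: {standard} weights")
print(f"grouped:  {grouped} weights")
print(f"reduction factor: {standard // grouped}x")
```

The reduction factor equals the number of groups. How the CAU's attention mechanism then preserves accuracy is not detailed in the abstract, so it is not modelled here.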
computer science, artificial intelligence
What problem does this paper attempt to address?