Deep learning for 3D human pose estimation and mesh recovery: A survey

Yang Liu, Changzhen Qiu, Zhiyong Zhang
DOI: https://doi.org/10.1016/j.neucom.2024.128049
2024-07-03
Abstract: 3D human pose estimation and mesh recovery have attracted widespread research interest in many areas, such as computer vision, autonomous driving, and robotics. Deep learning on 3D human pose estimation and mesh recovery has recently thrived, with numerous methods proposed to address different problems in this area. In this paper, to stimulate future research, we present a comprehensive review of recent progress over the past five years in deep learning methods for this area by delving into over 200 references. To the best of our knowledge, this survey is arguably the first to comprehensively cover deep learning methods for 3D human pose estimation, including both single-person and multi-person approaches, as well as human mesh recovery, encompassing methods based on explicit models and implicit representations. We also present comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions. A regularly updated project page can be found at https://github.com/liuyangme/SOTA-3DHPE-HMR.
Computer Vision and Pattern Recognition, Multimedia
What problem does this paper attempt to address?
This paper addresses the key challenges in 3D human pose estimation (3D HPE) and human mesh recovery (HMR). Specifically:

1. **Depth Uncertainty**: In monocular images, different 3D poses can project onto the same 2D image, which leaves the depth information ambiguous. The paper explores how to address this using the characteristics of light propagation and the principles of camera imaging.
2. **Body Structure Understanding**: The structural characteristics of the human body can provide constraints or prior information that improve pose estimation. The paper discusses how joint relationships, limb-aware networks, and pose grammars can be used to strengthen the understanding of body structure.
3. **Occlusion Problem**: Parts of the human body are frequently occluded, either by the body itself (self-occlusion) or by other objects and people, and this reduces the accuracy of pose estimation. The paper introduces techniques such as multi-view methods and probability triangulation modules to handle occlusion.
4. **Insufficient Data**: 3D pose estimation and mesh recovery models usually rely on large amounts of labeled 3D data for training, but acquiring such data is costly and time-consuming. The paper explores unsupervised, self-supervised, and weakly-supervised learning as ways to reduce the dependence on labeled data.

By comprehensively reviewing progress over the past five years (2019-2023), covering methods based on both explicit models and implicit representations, the paper aims to provide directions and references for future research.
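The depth uncertainty listed as item 1 can be illustrated with a small sketch. It assumes an ideal pinhole camera with the joint expressed in camera coordinates; the function names, the focal length, and the joint coordinates are hypothetical and chosen only for illustration, not taken from the survey. Any two points on the same ray through the camera centre land on the same pixel, so a single image cannot tell them apart.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a camera-frame 3D point (X, Y, Z) to 2D image coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


def scale_along_ray(point, factor):
    """Move a 3D point along the ray that passes through the camera centre."""
    return tuple(factor * c for c in point)


# Hypothetical joint position, 2 units in front of the camera.
joint = (0.5, -0.25, 2.0)
# The same ray, but twice as far away: a different 3D position.
farther = scale_along_ray(joint, 2.0)

p_near = project(joint)
p_far = project(farther)

print(p_near, p_far)  # both are (0.25, -0.125)
```

The two 3D positions differ, yet their projections coincide, which is why monocular methods need additional cues such as body-structure priors, temporal context, or extra views to resolve depth.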