Human Pose Estimation with Regression by Fusing Multi-View Visual Information

Xu Zhao, Yun Fu, Huazhong Ning, Yuncai Liu, Thomas S. Huang
2010-01-01
Abstract: We consider the problem of estimating 3D human body pose from visual signals within a discriminative framework. It is challenging because there is a wide gap between complex 3D human motion and planar visual observation, which makes this a severely ill-conditioned problem. In this paper, we focus on three critical factors in human body pose estimation, namely, feature extraction, learning algorithm, and camera utilization. On the feature level, we describe images using salient interest points represented by SIFT-like descriptors, in which the position, appearance, and local structural information are encoded simultaneously. On the learning algorithm level, we propose to use Gaussian processes and multiple linear regression to model the mapping between poses and features. On the camera level, we are interested in fusing image information from multiple cameras in different views. We make a comprehensive evaluation on the HumanEva database and obtain two new insights into the three crucial issues for human pose estimation: (1) Although the choice of feature is very important to the problem, once the learning algorithm becomes efficient, the choice of feature is no longer critical; (2) The impact of combining information from multiple cameras on pose estimation is closely related not only to the quantity of image information, but also to its quality. In most cases, the more information is involved, the better the results that can be achieved. But when the information quantity is the same, differences in quality lead to totally different performance. Furthermore, dense evaluations demonstrate that our approaches are an …
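As a rough illustration of the discriminative setup the abstract describes, the sketch below fits a multiple linear regression from image features to pose vectors and fuses several camera views. It is not the authors' implementation: the function names, the random stand-in data, the small ridge term, and fusion by simple feature concatenation are all assumptions made for the example, and the abstract does not state how the views are actually combined.

```python
import numpy as np

def fuse_views(per_view_features):
    """Fuse cameras by concatenating per-frame feature vectors (an assumed scheme)."""
    return np.hstack(per_view_features)

def fit_linear_pose_regressor(features, poses, ridge=1e-6):
    """Least-squares fit of poses ~= [features, 1] @ W."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    # A small ridge term (an assumption) keeps the normal equations well conditioned.
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ poses)

def predict_pose(W, features):
    """Apply the fitted linear map, including the bias column."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return X @ W

# Synthetic stand-in data: 3 hypothetical cameras, 8-D features, 6-D poses.
rng = np.random.default_rng(0)
n_frames, feat_dim, pose_dim = 200, 8, 6
views = [rng.normal(size=(n_frames, feat_dim)) for _ in range(3)]
fused = fuse_views(views)
true_W = rng.normal(size=(fused.shape[1], pose_dim))
poses = fused @ true_W + 0.01 * rng.normal(size=(n_frames, pose_dim))

# Train on the first 150 frames, evaluate on the remaining 50.
W = fit_linear_pose_regressor(fused[:150], poses[:150])
pred = predict_pose(W, fused[150:])
rmse = float(np.sqrt(np.mean((pred - poses[150:]) ** 2)))
print(f"held-out RMSE: {rmse:.4f}")
```

In practice the features would be descriptors extracted from real images, and a Gaussian process regressor could replace the linear map where a nonlinear mapping is needed.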