Improving Real-Time Omnidirectional 3D Multi-Person Human Pose Estimation with People Matching and Unsupervised 2D-3D Lifting

Pawel Knap, Peter Hardy, Alberto Tamajo, Hwasup Lim, Hansung Kim
2024-03-14
Abstract: Current human pose estimation systems focus on retrieving an accurate 3D global estimate of a single person. In contrast, this paper presents one of the first 3D multi-person human pose estimation systems that works in real time and handles basic forms of occlusion. First, we adjust an off-the-shelf 2D detector and an unsupervised 2D-3D lifting model for use with a 360$^\circ$ panoramic camera and mmWave radar sensors. We then introduce several contributions, including camera and radar calibrations and improved matching of people between the image and radar space. The system addresses both the depth and scale ambiguity problems by employing a lightweight 2D-3D pose lifting algorithm that runs in real time while remaining accurate in both indoor and outdoor environments, offering an affordable and scalable solution. Notably, our system's runtime remains nearly constant irrespective of the number of detected individuals, achieving a frame rate of approximately 7-8 fps on a laptop with a commercial-grade GPU.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses real-time, omnidirectional 3D pose estimation for multiple people. Current human pose estimation systems mainly focus on obtaining an accurate 3D global estimate of a single person from a single viewpoint. This study proposes a 3D multi-person pose estimation system that operates in real time and handles basic forms of occlusion. Specifically, the researchers adapt an off-the-shelf 2D detector and an unsupervised 2D-3D lifting model for use with a 360-degree panoramic camera and millimeter-wave radar sensors. They also introduce several contributions, including camera and radar calibration and improved matching of people between the image and radar spaces. These improvements increase matching accuracy and reduce the absolute error in placing poses within the 3D coordinate system, providing a cost-effective and scalable solution for both indoor and outdoor environments.
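To make the idea of matching people between image and radar space more concrete, the following is a minimal sketch and not the paper's algorithm. It assumes an equirectangular panorama, so a pixel column maps linearly to an azimuth angle, and a radar that reports ground-plane positions `(x, y)` already aligned with the camera's zero-azimuth direction. It pairs detections greedily by smallest angular difference. The function names, the axis convention, and the 15-degree gate `max_diff_deg` are all hypothetical choices made for illustration.

```python
import math


def pixel_to_azimuth(u, image_width):
    """Map an equirectangular pixel column to an azimuth in degrees, in [-180, 180)."""
    return (u / image_width) * 360.0 - 180.0


def radar_to_azimuth(x, y):
    """Azimuth in degrees of a radar point on the ground plane.

    Assumed convention (hypothetical): 0 degrees along +y, positive towards +x.
    """
    return math.degrees(math.atan2(x, y))


def circular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def match_people(pixel_columns, radar_points, image_width, max_diff_deg=15.0):
    """Greedily pair image detections with radar detections by azimuth.

    pixel_columns: horizontal pixel position of each detected person.
    radar_points:  (x, y) ground-plane position of each radar detection.
    Returns a dict mapping image index -> radar index; detections with no
    partner inside the angular gate are left unmatched.
    """
    candidates = []
    for i, u in enumerate(pixel_columns):
        az_img = pixel_to_azimuth(u, image_width)
        for j, (x, y) in enumerate(radar_points):
            diff = circular_diff(az_img, radar_to_azimuth(x, y))
            if diff <= max_diff_deg:
                candidates.append((diff, i, j))

    # Accept the closest pairs first, using each detection at most once.
    candidates.sort()
    matches = {}
    used_radar = set()
    for diff, i, j in candidates:
        if i not in matches and j not in used_radar:
            matches[i] = j
            used_radar.add(j)
    return matches
```

Once a person is matched, the radar range of the paired detection could supply the depth that a single panoramic image cannot, which is one way to read the abstract's remark about resolving depth and scale ambiguity. A real system would also need the camera-radar calibration the paper mentions, which this sketch assumes has already been applied.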