Real-Time 3D Pedestrian Tracking with Monocular Camera

Peng Xiao, Fei Yan, Jiannan Chi, Zhiliang Wang
DOI: https://doi.org/10.1155/2022/7437289
2022-01-01
Wireless Communications and Mobile Computing
Abstract: Target tracking has long been a popular research area in computer vision, and many important methods have been proposed. However, most methods can handle only partial or slight occlusion. If the target is lost, a common solution is to keep detecting, reidentify the target when it reappears, and then link the broken tracks together, but this makes tracking discontinuous. There are two key points in this problem: continuous tracking and occlusion judgment. In this paper, we propose a target tracking method with a short-time prediction function to solve this problem. For continuous tracking, we establish a 3D dynamic model to estimate the motion state of the target in each frame. For occlusion judgment, we use a depth prediction network to estimate the depth of the target and then use that depth to determine whether the target is occluded. Without relying on depth sensors or multiple cameras, we achieve depth estimation from a single monocular image, which greatly expands the range of applications of our method. Benefiting from the introduction of motion estimation and depth prediction, our method achieves significantly improved tracking accuracy and, in particular, better robustness to occlusion. Even when the target is completely occluded, it can be tracked for a short time without reidentification. In addition, we use knowledge distillation to speed up depth prediction by a factor of 2.08, and the final tracking speed reaches 52.6 Hz on a GPU, which meets real-time tracking requirements.
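The interplay between the two components named in the abstract, short-time motion prediction and depth-based occlusion judgment, can be illustrated with a minimal sketch. It is not the paper's implementation: the constant-velocity model, the `margin` threshold, the finite-difference velocity update, and all names (`State3D`, `predict`, `is_occluded`, `step`) are assumptions made for illustration, and the observed depth is simply passed in rather than produced by a depth prediction network.

```python
from dataclasses import dataclass


@dataclass
class State3D:
    """Hypothetical 3D target state: position (m) and velocity (m/s)."""
    x: float
    y: float
    z: float  # depth along the camera axis
    vx: float
    vy: float
    vz: float


def predict(state: State3D, dt: float) -> State3D:
    """Propagate the state by dt seconds under an assumed constant velocity."""
    return State3D(
        state.x + state.vx * dt,
        state.y + state.vy * dt,
        state.z + state.vz * dt,
        state.vx, state.vy, state.vz,
    )


def is_occluded(predicted_depth: float, observed_depth: float,
                margin: float = 0.5) -> bool:
    """Assumed rule: if the depth observed at the target's image location is
    clearly smaller than the predicted target depth, something is in front."""
    return observed_depth < predicted_depth - margin


def step(state: State3D, observed_xyz: tuple, dt: float,
         margin: float = 0.5):
    """One tracking step. Returns (new_state, occluded_flag)."""
    pred = predict(state, dt)
    ox, oy, oz = observed_xyz
    if is_occluded(pred.z, oz, margin):
        # Keep tracking on the prediction alone; no reidentification needed.
        return pred, True
    # Otherwise trust the observation and re-estimate velocity by differencing.
    new_state = State3D(ox, oy, oz,
                        (ox - state.x) / dt,
                        (oy - state.y) / dt,
                        (oz - state.z) / dt)
    return new_state, False


if __name__ == "__main__":
    s = State3D(0.0, 0.0, 5.0, 1.0, 0.0, 0.0)
    s, occ = step(s, (0.1, 0.0, 5.0), dt=0.1)  # target visible
    print(s, occ)
    s, occ = step(s, (0.1, 0.0, 2.0), dt=0.1)  # an occluder appears at 2 m
    print(s, occ)
```

While the occlusion flag is set, the track advances on the motion model alone, which is what allows a fully occluded target to be followed for a short time in this sketch. A practical tracker would more likely use a filter such as a Kalman filter than raw differencing.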