Monocular 3D object detection via estimation of paired keypoints for autonomous driving
Chaofeng Ji, Guizhong Liu, Dan Zhao
DOI: https://doi.org/10.1007/s11042-021-11801-3
IF: 2.577
2022-01-03
Multimedia Tools and Applications
Abstract: 3D object detection is a key task in autonomous driving. Because 3D structure information is lost during perspective projection, localizing an object in 3D from monocular images is challenging. We present a monocular 3D object detection method that formulates 3D object localization as a paired-keypoint regression problem. Our method exploits 2D bounding box priors to predict, for each object, the projections of paired 3D keypoints on the image plane, and recovers the object's location via inverse projection. A fast keypoint regression network predicts the keypoint projections and generates the initial 3D bounding box. To obtain more accurate 3D detection results, a lightweight cascaded refinement module then rectifies the initial 3D box, taking as input the instance point cloud converted from the monocular depth prediction. Experiments on the KITTI dataset demonstrate that our method achieves state-of-the-art performance using monocular images alone. On the KITTI test set, our method achieves 3D AP of 15.97, 10.42, and 7.91 on the three difficulty levels, respectively.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
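
The abstract states that object location is recovered from projected paired keypoints via inverse projection, but it does not give the formulation. The following is only a minimal sketch of one common way such a recovery can work under a pinhole camera model. It assumes, hypothetically, that the pair is the top and bottom center of the 3D box, that the object's metric height is known, and that the camera intrinsics (fx, fy, cx, cy) are available. None of these choices, nor the function names, are taken from the paper.

```python
def depth_from_paired_keypoints(v_top, v_bottom, height_m, fy):
    """Estimate depth from a vertical keypoint pair (assumed formulation).

    Under a pinhole model, a vertical segment of metric length height_m at
    depth z spans fy * height_m / z pixels, so z = fy * height_m / pixels.
    """
    pixel_height = v_bottom - v_top
    if pixel_height <= 0:
        raise ValueError("bottom keypoint must lie below the top keypoint")
    return fy * height_m / pixel_height


def backproject(u, v, z, fx, fy, cx, cy):
    """Inverse pinhole projection: pixel (u, v) at depth z -> camera-frame point."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)


def locate_object(top_px, bottom_px, height_m, fx, fy, cx, cy):
    """Return the 3D midpoint of the back-projected keypoint pair."""
    z = depth_from_paired_keypoints(top_px[1], bottom_px[1], height_m, fy)
    top = backproject(top_px[0], top_px[1], z, fx, fy, cx, cy)
    bottom = backproject(bottom_px[0], bottom_px[1], z, fx, fy, cx, cy)
    return tuple((a + b) / 2.0 for a, b in zip(top, bottom))


# Hypothetical intrinsics and keypoints, chosen only for illustration.
center = locate_object(
    top_px=(600.0, 145.0),
    bottom_px=(600.0, 215.0),
    height_m=1.5,
    fx=700.0, fy=700.0, cx=600.0, cy=180.0,
)
```

In this sketch the depth comes entirely from the pixel distance between the two keypoints, so a small regression error on either keypoint changes the estimated depth noticeably for distant objects. That sensitivity is one plausible motivation for a refinement stage like the one the abstract describes, although the paper's actual reasoning is not given here.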