Accelerating Point-Voxel Representation of 3-D Object Detection for Automatic Driving

Jiecheng Cao, Chongben Tao, Zufeng Zhang, Zhen Gao, Xizhao Luo, Sifa Zheng, Yuan Zhu
DOI: https://doi.org/10.1109/tai.2023.3237787
2024-01-01
IEEE Transactions on Artificial Intelligence
Abstract: Current point-voxel fusion methods for 3-D object detection in autonomous driving do not make full use of the complementary information in the two representations. Therefore, a novel two-stage 3-D object detection method, called accelerating point-voxel representation (APVR), is proposed. It integrates the advantages of point-based and voxel-based features into a single 3-D representation, so that it retains more fine-grained information about an object while maintaining high efficiency. Specifically, the computational cost is reduced by adding offsets to query the neighboring voxels of key-points, and more fine-grained information is obtained by calculating the matching probability between neighboring voxels and key-points. During the optimization of the prediction boxes, virtual grid points are set to capture the spatial information between key-points, and a minimum-enclosing-rectangle constraint is added to optimize the directions of the prediction boxes. Extensive experiments on the KITTI, NuScenes, and Waymo datasets demonstrate the generalizability and portability of the proposed approach. Comparisons with state-of-the-art methods demonstrate the effectiveness and efficiency of APVR, which reaches a real-time processing frame rate of 40.4 Hz while maintaining high prediction accuracy.
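The abstract does not give implementation details for the offset-based voxel query, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a uniform voxel grid stored as a hash map and a fixed cubic offset pattern; the function names `build_voxel_index` and `query_neighbor_voxels`, the `voxel_size` value, and the `radius` parameter are all hypothetical. The point it illustrates is that neighbors are found by adding integer offsets to the key-point's voxel coordinate and doing constant-time lookups, rather than searching over all points.

```python
import itertools

import numpy as np


def build_voxel_index(points, voxel_size):
    """Map each occupied integer voxel coordinate to the indices of the points inside it."""
    coords = np.floor(points / voxel_size).astype(np.int64)
    index = {}
    for i, c in enumerate(coords.tolist()):
        index.setdefault(tuple(c), []).append(i)
    return index


def query_neighbor_voxels(keypoint, index, voxel_size, radius=1):
    """Return the occupied voxels within `radius` cells of the key-point's voxel.

    Assumption: a cubic (2 * radius + 1)^3 offset pattern; the paper may use
    a different neighborhood.
    """
    base = np.floor(np.asarray(keypoint) / voxel_size).astype(np.int64)
    hits = []
    for off in itertools.product(range(-radius, radius + 1), repeat=3):
        key = tuple((base + np.array(off)).tolist())
        if key in index:  # hash lookup, no distance computation needed
            hits.append(key)
    return hits


# Toy point cloud: two nearby points and one far away.
points = np.array([[0.1, 0.1, 0.1], [0.6, 0.1, 0.1], [5.0, 5.0, 5.0]])
index = build_voxel_index(points, voxel_size=0.5)
neighbors = query_neighbor_voxels([0.2, 0.2, 0.2], index, voxel_size=0.5)
```

With this pattern the cost per key-point depends only on the number of offsets (27 for `radius=1`), not on the size of the point cloud, which is one plausible reading of how an offset-based query reduces computation.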