PEPillar: a point-enhanced pillar network for efficient 3D object detection in autonomous driving

Libo Sun, Yifan Li, Wenhu Qin
DOI: https://doi.org/10.1007/s00371-024-03481-5
IF: 2.835
2024-05-31
The Visual Computer
Abstract: Pillar-based 3D object detection methods outperform traditional point-based and voxel-based methods in terms of speed. However, most recent methods in this category use simple aggregation techniques to construct pillar feature maps, which leads to a significant loss of raw point cloud detail and a decrease in detection accuracy. Given the critical demand for both rapid response and high precision in autonomous driving, we introduce PEPillar, a 3D object detection method that adopts point cloud data fusion. Concretely, we first use the Point-Enhanced Pillar module to learn pillar and keypoint features from the input data. An attention mechanism is then employed to integrate features from multiple sources, which improves the model's ability to detect various objects and makes it more robust in complex scenarios. Benefiting from the simplicity of the pillar representation, PEPillar can use established 2D convolutional neural networks, which sidesteps the challenges of redesigning the backbone network. The Multi-Receptive Field Neck is introduced to enhance the detection accuracy of smaller objects. Additionally, we provide the model in a faster single-stage format and a more precise two-stage format to meet various requirements. The evaluation results indicate a 5.14% improvement of our method over the baseline model in the moderately difficult car detection task, achieving levels comparable to state-of-the-art methods that use point and voxel representations.
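To make the "simple aggregation" criticized in the abstract concrete, the following is a minimal sketch of one common form of it: points are binned into vertical pillars on an x-y grid, and each pillar keeps only the channel-wise maximum of its points' features. It is not the authors' implementation and does not represent the Point-Enhanced Pillar module. The function name `pillar_max_pool` and the parameters `pillar_size`, `grid_min`, and `grid_shape` are hypothetical, chosen only for illustration.

```python
import numpy as np

def pillar_max_pool(points, features, pillar_size=0.5,
                    grid_min=(0.0, 0.0), grid_shape=(4, 4)):
    """Scatter per-point features into a 2D pillar grid by channel-wise max.

    points:   (N, 3) array of x, y, z coordinates.
    features: (N, C) array of per-point features.
    Returns a (grid_shape[0], grid_shape[1], C) pillar feature map.
    All names and defaults here are illustrative assumptions.
    """
    points = np.asarray(points, dtype=np.float64)
    features = np.asarray(features, dtype=np.float64)
    grid_shape = tuple(grid_shape)

    # Map each point's (x, y) to an integer pillar index; z is ignored.
    ij = np.floor((points[:, :2] - np.asarray(grid_min)) / pillar_size).astype(int)

    # Drop points that fall outside the grid.
    inside = np.all((ij >= 0) & (ij < np.asarray(grid_shape)), axis=1)
    ij, features = ij[inside], features[inside]

    # Start at -inf so the unbuffered max works, then zero the empty pillars.
    grid = np.full(grid_shape + (features.shape[1],), -np.inf)
    np.maximum.at(grid, (ij[:, 0], ij[:, 1]), features)
    grid[np.isneginf(grid)] = 0.0
    return grid


# Two points share pillar (0, 0); one point falls in pillar (3, 0).
pts = [[0.1, 0.1, 0.0], [0.2, 0.3, 1.0], [1.6, 0.2, 0.0]]
feats = [[1.0, 5.0], [3.0, 2.0], [7.0, 7.0]]
bev = pillar_max_pool(pts, feats)
print(bev[0, 0], bev[3, 0])
```

In this sketch the two points in pillar (0, 0) collapse to a single feature vector, so their individual values and their heights are no longer recoverable from the map. That is the kind of raw point cloud detail the abstract says is lost under simple aggregation.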
computer science, software engineering