ARFA: Adaptive Reception Field Aggregation for 3-D Detection from LiDAR Point Cloud

Diankun Zhang, Xueqing Wang, Zhijie Zheng, Xiaojun Liu, Guangyou Fang
DOI: https://doi.org/10.1109/jsen.2022.3230947
IF: 4.3
2022-01-01
IEEE Sensors Journal
Abstract: Submanifold convolution is widely used in 3-D detection. However, because Light Detection and Ranging (LiDAR) point clouds are nonuniformly distributed, it gives voxels different receptive fields, which degrades feature extraction for distant voxels and, in turn, detector performance. We propose the adaptive receptive field aggregation (ARFA) network, an end-to-end two-stage LiDAR 3-D object detection architecture. ARFA searches the top-$K$ nearest neighbors (KNNs) to adaptively adjust the receptive field of sparse voxels, followed by a self-attention aggregation (SA) module with density feature embedding (DE) to aggregate the semantic information in the receptive field. To further strengthen detection performance for small objects, we also propose an upsampling bird's-eye view (U-BEV) backbone and an Intersection over Union (IoU)-aware head to enhance the quality of the proposals and rectify the confidence of the predicted bounding boxes. ARFA outperforms the state-of-the-art methods on the Waymo Open dataset and achieves competitive results on the popular KITTI dataset.
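The neighbor search and attention-based aggregation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`knn_indices`, `attention_aggregate`), the brute-force distance computation, and the plain scaled dot-product attention are assumptions made for illustration, and the learned projections and the density feature embedding (DE) are omitted. It only shows that a fixed K gives every voxel the same number of neighbors, so the spatial extent of the receptive field grows where the point cloud is sparse.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of every point (the point itself included)."""
    diff = points[:, None, :] - points[None, :, :]   # (N, N, 3) pairwise offsets
    dist2 = np.sum(diff * diff, axis=-1)             # (N, N) squared distances
    return np.argsort(dist2, axis=1)[:, :k]          # (N, k) closest first

def attention_aggregate(features, neighbors):
    """Scaled dot-product attention of each voxel over its k neighbors.

    Hypothetical simplification: queries, keys, and values are the raw
    features; a real network would use learned projections.
    """
    d = features.shape[1]
    keys = features[neighbors]                                     # (N, k, d)
    scores = np.einsum("nd,nkd->nk", features, keys) / np.sqrt(d)  # (N, k)
    scores = scores - scores.max(axis=1, keepdims=True)            # stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    aggregated = np.einsum("nk,nkd->nd", weights, keys)            # (N, d)
    return aggregated, weights

# Toy data standing in for non-empty voxel centers and their features.
rng = np.random.default_rng(0)
centers = rng.uniform(-10.0, 10.0, size=(32, 3))
feats = rng.normal(size=(32, 8))

nbrs = knn_indices(centers, k=4)
agg, w = attention_aggregate(feats, nbrs)
```

The brute-force search is quadratic in the number of voxels, which is acceptable for a toy example; a practical detector would need a spatially indexed or GPU-based neighbor search.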