Object Detection Based on Fusion of Sparse Point Cloud and Image Information

Xiaobin Xu, Lei Zhang, Jian Yang, Chenfei Cao, Zhiying Tan, Minzhou Luo
DOI: https://doi.org/10.1109/tim.2021.3102739
IF: 5.6
2021-01-01
IEEE Transactions on Instrumentation and Measurement
Abstract: With the rapid development of mobile robots, environmental perception based on a single sensor can hardly meet the requirements of object detection and path planning in complex scenarios. In this article, an object detection fusion algorithm that uses both the LiDAR point cloud and the camera image is proposed. First, YOLOv4 is used to detect the objects in the image. Then, the point cloud is projected into the image, and the points that fall within the range of each 2-D detection frame are extracted as the target point cloud. Density clustering is performed on the target point cloud to output a bounding box with semantic labels. Meanwhile, the original point cloud data are processed by an improved four-neighborhood clustering algorithm based on the Euclidean distance and an angle threshold to output another bounding box without semantic labels. Finally, the clusters obtained by the two methods are fused and judged to produce the final object detection results. Tests on the KITTI dataset show that the accuracy of the improved four-neighborhood clustering algorithm increases to 0.835. The final semantic segmentation results have average positioning errors of 0.033 m and 0.073 m in the $x$- and $y$-directions, respectively. The average angular error of the vehicle direction is 0.90°. Compared with two other point cloud segmentation networks, our approach has the highest accuracy and sufficient real-time performance, reaching 9.96 Hz in the experiment.
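The projection-and-filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single 3x4 matrix `P` that already combines the LiDAR-to-camera extrinsics with the camera intrinsics, and a 2-D detection frame given as pixel bounds. The names `project_point`, `points_in_box`, and the numbers in the usage lines are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]
Box = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max), pixels


def project_point(
    P: Sequence[Sequence[float]], point: Point
) -> Optional[Tuple[float, float, float]]:
    """Project a 3-D point with a 3x4 matrix P (assumed calibration).

    Returns (u, v, depth), or None when the point lies behind the camera.
    """
    hom = (point[0], point[1], point[2], 1.0)  # homogeneous coordinates
    u_h, v_h, w = (sum(row[i] * hom[i] for i in range(4)) for row in P)
    if w <= 0.0:
        return None
    return u_h / w, v_h / w, w


def points_in_box(
    points: Sequence[Point], P: Sequence[Sequence[float]], box: Box
) -> List[Point]:
    """Keep the 3-D points whose projection falls inside the 2-D frame."""
    u_min, v_min, u_max, v_max = box
    selected = []
    for p in points:
        proj = project_point(P, p)
        if proj is None:
            continue
        u, v, _depth = proj
        if u_min <= u <= u_max and v_min <= v <= v_max:
            selected.append(p)
    return selected


# Hypothetical calibration: focal length 100 px, principal point (50, 50).
P_example = [
    [100.0, 0.0, 50.0, 0.0],
    [0.0, 100.0, 50.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
cloud = [(0.0, 0.0, 10.0), (5.0, 0.0, 10.0), (0.0, 0.0, -5.0)]
target_points = points_in_box(cloud, P_example, (40.0, 40.0, 60.0, 60.0))
```

In this sketch the extracted `target_points` would then be passed to a density clustering step to separate the object from background points that also project into the frame; that step is omitted here.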