SGFNet: Segmentation Guided Fusion Network for 3D Object Detection.

Yunlong Wang, Kun Jiang, Tuopu Wen, Xinyu Jiao, Benny Wijaya, Jinyu Miao, Yining Shi, Zheng Fu, Mengmeng Yang, Diange Yang
DOI: https://doi.org/10.1109/lra.2023.3326697
IF: 5.2
2023-01-01
IEEE Robotics and Automation Letters
Abstract: Self-driving applications require accurate 3D object detection, which is essential to several tasks such as path and motion planning. However, fusion-based detectors that combine cameras and LiDAR sensors have so far been inferior to LiDAR-only detectors. This can be attributed to the dual scene representation problem caused by the differing modalities of LiDAR points and images. Moreover, because the point cloud is sparse, a projected image pixel is not guaranteed to land on a corresponding point, so image content is lost in the fusion process. With these issues in mind, we propose the Segmentation Guided Fusion Network (SGFNet), an efficient multi-sensor fusion-based 3D object detector. It first extracts image and point features separately while giving them a unified high-dimensional feature representation through the newly proposed auxiliary foreground segmentation head. It then projects hierarchical feature maps, rather than raw image pixels, onto the points to obtain a unified feature map, achieving a consistent data modality. These image feature maps span several spatial resolutions so that more image content is retained during projection. Finally, the unified feature map is fed into a fusion-based region proposal module and a bounding box regression head to generate accurate 3D bounding boxes. Extensive experiments on the KITTI and nuScenes datasets demonstrate that SGFNet achieves competitive performance on fusion-based 3D object detection tasks and reports a new state of the art in terms of the 3D average precision metric.
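The projection step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `project_points` and `gather_multiscale`, the 3 x 4 projection matrix `proj`, the per-map `strides`, and the nearest-cell sampling are all assumptions made for illustration. The sketch only shows the general idea of attaching image features from several resolutions to each LiDAR point.

```python
import numpy as np


def project_points(points_xyz, proj):
    """Project N x 3 points to pixel coordinates with a 3 x 4 matrix (assumed calibration)."""
    n = points_xyz.shape[0]
    homo = np.hstack([points_xyz, np.ones((n, 1))])  # N x 4 homogeneous coordinates
    uvw = homo @ proj.T                              # N x 3
    depth = uvw[:, 2]
    in_front = depth > 1e-6                          # points behind the camera are invalid
    uv = np.zeros((n, 2))
    uv[in_front] = uvw[in_front, :2] / depth[in_front, None]
    return uv, in_front


def gather_multiscale(points_xyz, proj, feature_maps, strides):
    """Attach image features from several resolutions to each point (hypothetical helper).

    feature_maps: list of C_i x H_i x W_i arrays; strides: downsampling factor of each map.
    Points that fall outside a map, or behind the camera, receive zeros for that map.
    """
    uv, valid = project_points(points_xyz, proj)
    per_scale = []
    for fmap, stride in zip(feature_maps, strides):
        c, h, w = fmap.shape
        # Nearest-cell lookup; a smoother interpolation could be substituted here.
        cols = np.floor(uv[:, 0] / stride).astype(int)
        rows = np.floor(uv[:, 1] / stride).astype(int)
        inside = valid & (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
        feats = np.zeros((points_xyz.shape[0], c), dtype=fmap.dtype)
        feats[inside] = fmap[:, rows[inside], cols[inside]].T
        per_scale.append(feats)
    # One row per point, with channels from all resolutions concatenated.
    return np.concatenate(per_scale, axis=1)
```

In this sketch every point receives a feature vector from every resolution, so coarse maps still contribute context where a fine map alone would be sampled only sparsely.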