FSFNet: Foreground Score-Aware Fusion for 3-D Object Detector under Unfavorable Conditions

Jia Lin, Huilin Yin, Jun Yan, Kaifeng Jian, Yu Lu, Wancheng Ge, Hao Zhang, Gerhard Rigoll
DOI: https://doi.org/10.1109/jsen.2023.3283018
IF: 4.3
2023-01-01
IEEE Sensors Journal
Abstract: Multimodal fusion-based 3-D object detectors offer a potential way to resolve the failure cases of single-modality methods. However, current fusion approaches still struggle under unfavorable conditions, e.g., poorly illuminated driving scenes and crowded traffic, which lead to degraded image quality and object occlusion. To this end, we present FSFNet, a multimodal fusion network for the 3-D object detection task that consists of a local graph-aware point cloud backbone (LGB), a foreground score-aware fusion network (FSFN), and a proposals’ refining loss (PRL). Concretely, in LGB a directed graph is built to generate edgewise features for each point, and the point features are supplemented with this graph information. To mitigate the degraded image features caused by poor illumination, FSFN produces an adaptive multimodal feature by taking pointwise foreground scores into consideration. Hence, levelwise point features with high confidence are fully used, and imperfect image information is suppressed in the fusion stage. We further introduce PRL to reduce false positives and false negatives in crowded scenes by optimizing the locations and scores of the predicted 3-D bounding boxes. Extensive experiments on the KITTI benchmark demonstrate the superiority of FSFNet over state-of-the-art networks. Moreover, FSFN is verified to be robust to image inputs captured under poor illumination conditions.
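The abstract does not give the fusion formula, so the following is only a minimal sketch of the general idea of score-aware fusion, not the authors' FSFN. It assumes that each point carries a foreground logit, that the logit is squashed to a score in (0, 1), and that the score acts as a convex blending weight between point and image features. The function names `sigmoid` and `score_aware_fusion`, the blending rule, and the toy inputs are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw foreground logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def score_aware_fusion(point_feats, image_feats, fg_logits):
    """Blend per-point LiDAR and image features with a foreground score.

    Assumption (not taken from the paper): the fused feature is
    s * point + (1 - s) * image, so a confident foreground point keeps
    its point feature and the image contribution is suppressed.

    point_feats, image_feats: lists of N feature vectors of equal length.
    fg_logits: list of N raw foreground logits, one per point.
    """
    fused = []
    for p_row, i_row, logit in zip(point_feats, image_feats, fg_logits):
        s = sigmoid(logit)
        fused.append([s * p + (1.0 - s) * i for p, i in zip(p_row, i_row)])
    return fused


# Toy example: two points, two channels each.
points = [[1.0, 1.0], [1.0, 1.0]]
images = [[0.0, 0.0], [0.0, 0.0]]
logits = [0.0, 50.0]  # an uncertain point and a confident foreground point
fused = score_aware_fusion(points, images, logits)
```

In this toy setting the uncertain point ends up as an even mix of both modalities, while the confident foreground point keeps essentially only its point feature. A learned network would typically predict the scores and apply the weighting at several feature levels; that part is omitted here.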