PPF-Det: Point-Pixel Fusion for Multi-Modal 3D Object Detection

Guotao Xie, Zhiyuan Chen, Ming Gao, Manjiang Hu, Xiaohui Qin
DOI: https://doi.org/10.1109/tits.2023.3347078
IF: 8.5
2024-01-01
IEEE Transactions on Intelligent Transportation Systems
Abstract: Multi-modal fusion can take advantage of both LiDAR and camera to boost the robustness and performance of 3D object detection. However, great challenges remain in comprehensively exploiting image information and in performing accurate interaction and fusion of diverse features. In this paper, we propose a novel multi-modal framework, namely Point-Pixel Fusion for Multi-Modal 3D Object Detection (PPF-Det). PPF-Det consists of three submodules that address these problems: Multi Pixel Perception (MPP), Shared Combined Point Feature Encoder (SCPFE), and Point-Voxel-Wise Triple Attention Fusion (PVW-TAF). First, MPP makes full use of image semantic information to mitigate the resolution mismatch between point cloud and image. In addition, we propose SCPFE to preliminarily extract point cloud features and point-pixel features simultaneously, reducing time consumption in 3D space. Lastly, we propose a fine alignment fusion strategy, PVW-TAF, to generate multi-level voxel-fused features based on an attention mechanism. Extensive experiments on KITTI benchmarks, conducted on September 24, 2023, demonstrate that our method shows excellent performance.
engineering, electrical & electronic,transportation science & technology, civil
What problem does this paper attempt to address?