BEVEFNet: A Multiple Object Tracking Model Based on LiDAR-Camera Fusion

Yi Yuan, Ying Liu
DOI: https://doi.org/10.1016/j.procs.2024.08.106
2024-01-01
Procedia Computer Science
Abstract: Object tracking is a crucial task in computer vision, and tracking models are widely used in application domains such as autonomous driving. However, existing multiple object tracking methods still struggle to track multiple moving targets accurately and efficiently in real time. This paper presents BEVEFNet, a camera-LiDAR multi-target tracking model based on multistage fusion. It combines the semantic information from optical images with the spatial and geometric information from LiDAR data, unifying the multi-modal features in a shared Bird’s Eye View (BEV) representation space. With LiDAR data complementing the optical images, multi-level fusion is achieved at both the feature level and the decision level. The proposed efficient sparse 3D feature extraction network incorporates sparse convolution, which significantly increases the speed of multiple object tracking. Experiments conducted on the nuScenes dataset demonstrate that BEVEFNet achieves an AMOTA of 69.7, improving the accuracy of multiple object tracking.
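To make the idea of a shared BEV representation space concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' network: BEVEFNet uses a sparse 3D feature extraction network, whereas this example only rasterizes raw LiDAR points into a top-down grid and concatenates the result with a camera feature map that is assumed to have already been projected onto the same grid. The function names `lidar_to_bev` and `fuse_bev`, the grid ranges, the cell size, and the two hand-crafted channels are all illustrative assumptions.

```python
import numpy as np


def lidar_to_bev(points, x_range=(-50.0, 50.0), y_range=(-50.0, 50.0), cell=0.5):
    """Rasterize LiDAR points of shape (N, 3) into a (2, H, W) BEV grid.

    Channel 0 holds the number of points per cell.
    Channel 1 holds the maximum height per cell, floored at 0
    (empty cells and cells with only negative heights stay at 0).
    Hypothetical helper for illustration only.
    """
    nx = int(round((x_range[1] - x_range[0]) / cell))
    ny = int(round((y_range[1] - y_range[0]) / cell))
    bev = np.zeros((2, ny, nx), dtype=np.float32)

    # Discard points that fall outside the grid.
    inside = (
        (points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1])
        & (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1])
    )
    pts = points[inside]

    # Convert metric coordinates to integer cell indices.
    ix = ((pts[:, 0] - x_range[0]) / cell).astype(int)
    iy = ((pts[:, 1] - y_range[0]) / cell).astype(int)

    # Unbuffered accumulation so repeated indices are handled correctly.
    np.add.at(bev[0], (iy, ix), 1.0)
    np.maximum.at(bev[1], (iy, ix), pts[:, 2])
    return bev


def fuse_bev(camera_bev, lidar_bev):
    """Feature-level fusion by channel concatenation on a shared BEV grid."""
    if camera_bev.shape[1:] != lidar_bev.shape[1:]:
        raise ValueError("camera and LiDAR BEV grids must share the same H x W")
    return np.concatenate([camera_bev, lidar_bev], axis=0)


# Toy usage: three points, one of which lies outside a 2 m x 2 m grid.
points = np.array([[0.2, 0.2, 1.0], [0.3, 0.1, 2.0], [100.0, 0.0, 0.0]])
lidar_bev = lidar_to_bev(points, x_range=(-1.0, 1.0), y_range=(-1.0, 1.0), cell=0.5)
camera_bev = np.ones((8, 4, 4), dtype=np.float32)  # stand-in for projected image features
fused = fuse_bev(camera_bev, lidar_bev)
```

Concatenation is only one simple way to combine modalities once they share a grid. The abstract describes fusion at both the feature and decision levels, and this sketch does not attempt to reproduce either stage.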