Cross-Modal 3D Object Detection and Tracking for Auto-Driving

Yihan Zeng, Chao Ma, Ming Zhu, Zhiming Fan, Xiaokang Yang
DOI: https://doi.org/10.1109/iros51168.2021.9636498
2021-01-01
Abstract: Detecting and tracking objects in 3D scenes play crucial roles in autonomous driving. Successfully recognizing objects through space and time hinges on a strong detector and a reliable association scheme. Recent 3D detection and tracking approaches commonly represent objects as points when associating detection results with trajectories. Despite their demonstrated success, these approaches do not fully exploit the rich appearance information of objects. In this paper, we present a conceptually simple yet effective algorithm, named AlphaTrack, which considers both location and appearance changes to perform joint 3D object detection and tracking. To achieve this, we propose a cross-modal fusion scheme that fuses camera appearance features with LiDAR features to facilitate 3D detection and tracking. We further attach an additional branch to the 3D detector to output instance-aware appearance embeddings, which significantly improve tracking performance with our designed association mechanisms. Extensive validations on a large-scale autonomous driving dataset demonstrate the effectiveness of the proposed algorithm in comparison with state-of-the-art approaches. Notably, the proposed algorithm ranks first on the nuScenes tracking leaderboard to date.
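The idea of associating detections with trajectories using both location and appearance can be illustrated with a minimal sketch. This is not the AlphaTrack association mechanism, which the abstract does not specify. It assumes each track and detection carries a 3D center and an appearance embedding, mixes a scaled center distance with a cosine distance, and matches greedily. The names `association_cost` and `greedy_associate` and the parameters `w_app`, `dist_scale`, and `max_cost` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def association_cost(track, det, w_app=0.5, dist_scale=2.0):
    """Weighted sum of a location term and an appearance term.

    w_app and dist_scale are illustrative assumptions, not values from the paper.
    """
    d_loc = math.dist(track["center"], det["center"]) / dist_scale
    d_app = cosine_distance(track["embedding"], det["embedding"])
    return (1.0 - w_app) * d_loc + w_app * d_app


def greedy_associate(tracks, detections, max_cost=1.0, **cost_kwargs):
    """Match lowest-cost (track, detection) pairs first; each index is used once."""
    candidates = sorted(
        (association_cost(t, d, **cost_kwargs), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
    )
    used_tracks, used_dets, matches = set(), set(), []
    for cost, ti, di in candidates:
        if cost > max_cost:
            break  # remaining candidates are even more expensive
        if ti in used_tracks or di in used_dets:
            continue
        used_tracks.add(ti)
        used_dets.add(di)
        matches.append((ti, di))
    return sorted(matches)


# Two nearby objects: location alone is ambiguous, appearance disambiguates.
tracks = [
    {"center": (0.0, 0.0, 0.0), "embedding": (1.0, 0.0)},
    {"center": (1.0, 0.0, 0.0), "embedding": (0.0, 1.0)},
]
detections = [
    {"center": (0.5, 0.0, 0.0), "embedding": (0.0, 1.0)},
    {"center": (0.6, 0.0, 0.0), "embedding": (1.0, 0.0)},
]
matches_with_appearance = greedy_associate(tracks, detections)
matches_location_only = greedy_associate(tracks, detections, w_app=0.0)
```

In this toy scene the two calls give different assignments, which shows why a point-only representation can confuse nearby objects. A real tracker would more likely use an optimal assignment solver and motion prediction; greedy matching is used here only to keep the sketch dependency-free.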