TEMI-MOT: Towards Efficient Multi-Modality Instance-Aware Feature Learning for 3D Multi-Object Tracking

Yufeng Hu, Sanping Zhou, Jinpeng Dong, Nanning Zheng
DOI: https://doi.org/10.1109/IJCNN54540.2023.10191718
2023-01-01
Abstract: 3D multi-object tracking is a key technology for autonomous driving; it aims to ensure that autonomous vehicles accurately perceive the movements and intentions of surrounding traffic participants. In recent years, several multi-modality 3D multi-object tracking methods have been proposed. Although these methods improve the accuracy of object association during tracking, they still struggle with three problems: feature ambiguity due to occlusion, incorrect feature alignment between different modalities, and confusion between the features of adjacent targets caused by coarse-grained feature maps. To address these problems, we propose a new multi-modality feature learning method for 3D multi-object tracking, named TEMI-MOT, which is composed of three modules in series: the point-guided image feature sampler, the instance-aware feature encoder, and the tracking pipeline. The point-guided image feature sampler aligns the point cloud with the image features; the instance-aware feature encoder fuses the aligned image features with each object's points to generate discriminative instance-aware features; and the tracking pipeline outputs the final results based on the instance-aware features and G-IoU geometric similarities. Our approach achieves state-of-the-art results on the nuScenes dataset among methods that use CenterPoint detections. The experimental results show that the proposed method improves robustness and effectiveness for 3D multi-object tracking.
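The abstract names G-IoU (generalized IoU) as the geometric similarity used in association but does not spell it out. As a rough illustration only, the sketch below computes the standard G-IoU, IoU minus the fraction of the smallest enclosing box not covered by the union, for two 3D boxes. It assumes axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax), which is a simplification; the function name and box layout are hypothetical and not taken from the paper, whose boxes are presumably oriented.

```python
from typing import Sequence


def giou_3d_axis_aligned(a: Sequence[float], b: Sequence[float]) -> float:
    """Generalized IoU of two axis-aligned 3D boxes.

    Each box is (xmin, ymin, zmin, xmax, ymax, zmax). This layout is an
    assumption made for illustration, not the paper's representation.
    """
    def volume(box: Sequence[float]) -> float:
        return (max(box[3] - box[0], 0.0)
                * max(box[4] - box[1], 0.0)
                * max(box[5] - box[2], 0.0))

    inter = 1.0  # volume of the intersection box
    hull = 1.0   # volume of the smallest axis-aligned box enclosing both
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        inter *= max(hi - lo, 0.0)
        hull *= max(a[i + 3], b[i + 3]) - min(a[i], b[i])

    union = volume(a) + volume(b) - inter
    if union <= 0.0 or hull <= 0.0:
        return 0.0  # degenerate boxes: no meaningful similarity

    # GIoU = IoU - (|hull| - |union|) / |hull|, which lies in (-1, 1]
    return inter / union - (hull - union) / hull
```

Unlike plain IoU, this value stays informative when two boxes do not overlap: it becomes more negative as they move apart, which is one reason it is attractive as an association cost between detections and predicted track boxes.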