Spatiotemporal adaptive attention 3D multiobject tracking for autonomous driving

Xiaofei Zhang, Zhengping Fan, Xiaojun Tan, Qunming Liu, Yanli Shi
DOI: https://doi.org/10.1016/j.knosys.2023.110442
IF: 8.139
2023-05-01
Knowledge-Based Systems
Abstract: Three-dimensional (3D) multiobject tracking (MOT) is an essential perception task for autonomous vehicles (AVs). Studies have indicated that multimodal data fusion can provide more stable and efficient perception information to AVs than a single sensor. Therefore, this paper proposes a new spatiotemporal adaptive attention 3D (3DSTAA) tracker, which aims to improve the tracking performance of end-to-end 3D MOT by adaptively correlating spatiotemporal data. The contributions of this paper are as follows. (1) Unlike nonintelligent fusion methods, this paper uses an efficient, adaptive spatial-guided fusion (SGFus) module for multimodal feature fusion. As a result, the 3D structural information obtained from point cloud data provides spatial information that complements the 2D texture information extracted from the image data, and the two together refine the perception information representation in the margin area. (2) This paper develops a spatiotemporal object-unique attention (STOUA) module that calculates the relational degree of each perceived object between two adjacent frames through attentional encoding. At the same time, an adaptive weighting strategy is used to further study the spatiotemporal correlation of unique objects, reducing the similarity among different objects and the differences across the same object. Experiments on the KITTI tracking benchmark show that the 3DSTAA tracker is highly competitive with state-of-the-art (SOTA) methods in both inference time and tracking performance. The corresponding code will be released at https://github.com/xf-zh.
computer science, artificial intelligence
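
The abstract describes STOUA as computing a relational degree for each perceived object between two adjacent frames through attentional encoding. As a rough illustration of that general idea only, the sketch below computes scaled dot-product attention weights between per-object embeddings from two frames and picks the strongest match for each current object. It is not the paper's STOUA module: the function names, the toy embeddings, and the greedy matching step are all assumptions made for this example, and the adaptive weighting strategy is not reproduced.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_frame_affinity(prev_feats, curr_feats):
    """Attention weights from each current-frame object (row) to each
    previous-frame object (column), using scaled dot products.

    Assumption: every object is represented by an embedding vector of the
    same length; how the paper builds these embeddings is not modeled here.
    """
    dim = len(curr_feats[0])
    scale = 1.0 / math.sqrt(dim)
    affinity = []
    for q in curr_feats:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in prev_feats]
        affinity.append(softmax(scores))
    return affinity

def greedy_match(affinity):
    """Index of the highest-weight previous object for each current object.

    This does not enforce one-to-one assignment; it is a simplification.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in affinity]

# Hypothetical 3-dimensional embeddings for two objects in frames t-1 and t.
prev_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
curr_feats = [[0.1, 0.9, 0.0], [0.8, 0.2, 0.1]]

affinity = cross_frame_affinity(prev_feats, curr_feats)
matches = greedy_match(affinity)
```

Each row of `affinity` sums to 1, so it can be read as a soft assignment of one current-frame object over the previous-frame objects. A real tracker would typically replace the greedy step with a one-to-one assignment and handle objects that appear or disappear.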