Abstract: Three-dimensional (3D) multiobject tracking (MOT) is an essential perception task for autonomous vehicles (AVs). Studies have indicated that multimodal data fusion can provide AVs with more stable and efficient perception information than a single sensor. This paper therefore proposes a new spatiotemporal adaptive attention 3D (3DSTAA) tracker, which aims to improve the performance of end-to-end 3D MOT by adaptively correlating spatiotemporal data. The contributions of this paper are as follows. (1) Unlike nonintelligent fusion methods, this paper uses an efficient, adaptive spatial-guided fusion (SGFus) module for multimodal feature fusion. As a result, the 3D structural information obtained from point cloud data complements the 2D texture information extracted from image data, and the two together refine the representation of perception information in the margin area. (2) This paper develops a spatiotemporal object-unique attention (STOUA) module that calculates the relational degree of each perceived object between two adjacent frames through attentional encoding. In addition, an adaptive weighting strategy is used to further model the spatiotemporal correlation of unique objects, reducing the similarity among different objects and the differences across instances of the same object. Experiments on the KITTI tracking benchmark show that the 3DSTAA tracker is highly competitive with state-of-the-art (SOTA) methods in both inference time and tracking performance. The corresponding code will be released at https://github.com/xf-zh.
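
To make the idea of a "relational degree" between objects in two adjacent frames more concrete, the following is a minimal sketch of a generic scaled dot-product affinity. It is not the STOUA module from the paper, whose details are not given in the abstract. The function names (`softmax`, `cross_frame_affinity`, `best_matches`) and the toy feature vectors are hypothetical, chosen only for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_frame_affinity(prev_feats, curr_feats):
    """Scaled dot-product affinity between objects in two adjacent frames.

    prev_feats: feature vectors of objects in frame t-1 (assumed input)
    curr_feats: feature vectors of objects in frame t (assumed input)
    Returns a len(prev_feats) x len(curr_feats) matrix; each row sums to 1.
    """
    dim = len(prev_feats[0])
    scale = math.sqrt(dim)  # standard attention scaling by sqrt(feature size)
    scores = [
        [sum(p * c for p, c in zip(pf, cf)) / scale for cf in curr_feats]
        for pf in prev_feats
    ]
    return [softmax(row) for row in scores]


def best_matches(affinity):
    """For each frame t-1 object, index of the frame t object with the highest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in affinity]


# Toy data (hypothetical): two objects in frame t-1, three candidates in frame t.
prev_feats = [[4.0, 0.0], [0.0, 4.0]]
curr_feats = [[0.0, 4.0], [4.0, 0.0], [1.0, 1.0]]
affinity = cross_frame_affinity(prev_feats, curr_feats)
matches = best_matches(affinity)
```

In this toy case, each previous-frame object is paired with the current-frame object whose feature vector points in the same direction. A real tracker would replace the row-wise argmax with a one-to-one assignment step and would learn the features rather than fix them by hand.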

Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking

Exploit Spatiotemporal Contextual Information for 3D Single Object Tracking Via Memory Networks

RASTMTrack: Robust and Adaptive Space-Time Memory Networks for Visual Tracking

Object tracking with 3D LIDAR via multi-task sparse learning

STTracker: Spatio-Temporal Tracker for 3D Single Object Tracking

Visual Object Tracking with Multi-Frame Distractor Suppression

Online Object Tracking Based on CNN with Spatial-Temporal Saliency Guided Sampling

Spatiotemporal adaptive attention 3D multiobject tracking for autonomous driving

A Dynamic 3D Multi-Object Tracking Method Based on Spatiotemporal Features

Distractor-Aware Fast Tracking Via Dynamic Convolutions and MOT Philosophy

Spatial-Temporal Aware Long-Term Object Tracking

Dynamic memory network with spatial-temporal feature fusion for visual tracking

You Don't Only Look Once - Constructing Spatial-Temporal Memory for Integrated 3D Object Detection and Tracking

Distractor-aware discrimination learning for online multiple object tracking

Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds

Discriminative Segmentation Tracking Using Dual Memory Banks

Spatio-Temporal Contextual Learning for Single Object Tracking on Point Clouds

MBPTrack: Improving 3D Point Cloud Tracking with Memory Networks and Box Priors

Collaborative Tracking: Dynamically Fusing Short-Term Trackers and Long-Term Detector

A Lightweight and Detector-Free 3D Single Object Tracker on Point Clouds

Spatio-temporal Interactive Fusion Based Visual Object Tracking Method