Abstract: Visual tracking of multiple objects is an essential component of the perception system in autonomous vehicles. One favored approach is the tracking-by-detection paradigm, which links current detection hypotheses to previously estimated object trajectories (also known as tracks) by searching for appearance or motion similarities between them. Because this search usually relies on a very limited spatial or temporal locality, the association can fail under motion noise or long-term occlusion. In this paper, we propose a novel tracking method that addresses this problem by combining information from both an enlarged structural domain and an enlarged temporal domain. For efficiency without loss of optimality, the approach is decomposed into three stages, each dealing with only one constrained association task, and thus follows an alternating optimization scheme. Detections are first assembled into small tracklets based on meta-measurements of object affinity. The tracklet-to-track association task is then solved using structural information based on the motion pattern between them. Here, we propose new rules that decouple the processing time from the tracklet length. Furthermore, constraints from the temporal domain are introduced to recover objects that have disappeared for a long time because of failed detection or long-term occlusion. By combining this heterogeneous domain information, our approach improves on state-of-the-art performance on standard benchmarks. Because it requires relatively little processing time, the approach also permits online, real-time tracking.
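To make the first stage more concrete, the following is a minimal sketch of how detections might be chained into short tracklets across consecutive frames. It is not the method of the paper: the abstract does not specify the "meta-measurements of object affinity", so bounding-box IoU is used here as a stand-in, and the matching is a simple greedy one. The names `iou`, `build_tracklets`, and the `min_iou` threshold are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed box format
Tracklet = List[Tuple[int, Box]]         # list of (frame index, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def build_tracklets(frames: List[List[Box]], min_iou: float = 0.5) -> List[Tracklet]:
    """Greedily chain detections of consecutive frames into tracklets.

    IoU stands in for the unspecified affinity measure (an assumption).
    A tracklet is closed as soon as it finds no match in the next frame.
    """
    finished: List[Tracklet] = []
    active: List[Tracklet] = []
    for t, detections in enumerate(frames):
        # Score every (active tracklet, detection) pair, best affinity first.
        pairs = sorted(
            ((iou(trk[-1][1], det), i, j)
             for i, trk in enumerate(active)
             for j, det in enumerate(detections)),
            reverse=True,
        )
        used_trk, used_det = set(), set()
        for score, i, j in pairs:
            if score < min_iou:
                break  # remaining pairs are even weaker
            if i in used_trk or j in used_det:
                continue
            active[i].append((t, detections[j]))
            used_trk.add(i)
            used_det.add(j)
        # Unmatched tracklets are closed; unmatched detections start new ones.
        finished.extend(trk for i, trk in enumerate(active) if i not in used_trk)
        active = [trk for i, trk in enumerate(active) if i in used_trk]
        active.extend([(t, det)] for j, det in enumerate(detections) if j not in used_det)
    return finished + active
```

Closing a tracklet on the first missed frame keeps tracklets short and locally reliable, which leaves the harder long-range linking to later tracklet-to-track and temporal-recovery stages such as those the abstract describes. The greedy matching could be swapped for an optimal assignment solver if one-to-one optimality within a frame pair matters.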

You Only Need Two Detectors to Achieve Multi-Modal 3D Multi-Object Tracking

A Multi-Modal Fusion-Based 3D Multi-Object Tracking Framework with Joint Detection

Online Multi-Object Tracking from A Bird's-Eye View by Fusion of Millimeter-Wave Radar and Vision

Probabilistic 3D Multi-Modal, Multi-Object Tracking for Autonomous Driving

Object-Level Pseudo-3D Lifting for Distance-Aware Tracking

Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking

MMF-Track: Multi-modal Multi-level Fusion for 3D Single Object Tracking

A Dynamic 3D Multi-Object Tracking Method Based on Spatiotemporal Features

MCTrack: A Unified 3D Multi-Object Tracking Framework for Autonomous Driving

Multi-Object Tracking with Inter-Feedback Between Detection and Tracking

A Tracking-By-Detection Based 3D Multiple Object Tracking for Autonomous Driving

Multi-object tracking via deep feature fusion and association analysis

T2Track: Multi-Object Tracking by Associating Between Tracks

Three-Dimensional Multi-Object Tracking Based on Feature Fusion and Similarity Estimation Network

TEMI-MOT: Towards Efficient Multi-Modality Instance-Aware Feature Learning for 3D Multi-Object Tracking

MCCA-MOT: Multimodal Collaboration-Guided Cascade Association Network for 3D Multi-Object Tracking

Real-time Multi-Object Tracking Based on Bi-directional Matching

Online Multi-Object Tracking Using Joint Domain Information in Traffic Scenarios

ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box

Real-Time Online Multi-Object Tracking