Abstract: In detection-based multi-object tracking (MOT), one challenging problem is designing a robust affinity model for data association. Moreover, since these approaches rely entirely on detection responses to locate targets, a strategy is needed to cope with detector failures. In this paper, we propose a robust online MOT method that handles both issues effectively. We first present a novel affinity model that jointly learns a more powerful feature representation and a distance metric within a deep architecture. Specifically, we design a convolutional neural network (CNN) to extract an appearance cue tailored toward person re-identification (Re-ID) and a long short-term memory (LSTM) network to extract a motion cue that encodes the dynamics of targets. Both cues are then combined with a triplet loss function, which performs end-to-end deep metric learning to encode dependencies across the cues automatically and thus generates fused features in an embedding space to distinguish targets. To overcome the detector’s limitations, we present a trajectory estimation strategy. We design a recurrent neural network-based Bayesian filtering module, which takes the hidden state of the above-mentioned LSTM network as input and performs recursive prediction and update to explicitly estimate target states. In this way, we can reconstruct trajectories by filling the gaps where no detections are present or by adjusting the locations of a trajectory where detections are imprecise. Experiments on the challenging MOT 2015 and 2016 datasets show very competitive results when comparing our method with recent state-of-the-art batch and online tracking approaches. Our method ranks first in multiple object tracking accuracy (MOTA) and multiple object tracking precision (MOTP) among online methods on the MOT2016 dataset.
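
The triplet-loss fusion of appearance and motion cues described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the fusion by concatenation, the L2 normalisation, the squared Euclidean distance, the margin value, and the names `fuse`, `squared_distance`, and `triplet_loss` are all assumptions made for illustration, and short hand-written vectors stand in for the CNN and LSTM outputs.

```python
import math

def fuse(appearance, motion):
    """Concatenate the two cue vectors and L2-normalise the result.

    Concatenation is an assumed fusion rule, used here only for illustration.
    """
    fused = list(appearance) + list(motion)
    norm = math.sqrt(sum(x * x for x in fused)) or 1.0
    return [x / norm for x in fused]

def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)

# Toy embeddings (hypothetical values): anchor and positive share an identity,
# negative is a different target moving in the opposite direction.
anchor = fuse([0.9, 0.1], [1.0, 0.0])
positive = fuse([0.8, 0.2], [0.9, 0.1])
negative = fuse([0.1, 0.9], [-1.0, 0.0])

easy = triplet_loss(anchor, positive, negative)  # well-separated triplet
hard = triplet_loss(anchor, negative, positive)  # roles swapped, so it is violated
print(easy, hard)
```

In this toy case the well-separated triplet contributes no loss, while the swapped triplet is penalised, which is the pressure that pulls same-identity embeddings together and pushes different identities apart during training.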

End-to-End Learning Deep CRF models for Multi-Object Tracking

Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking

A CRF-Based Framework for Tracklet Inactivation in Online Multi-Object Tracking

Collaborative Deep Reinforcement Learning for Multi-object Tracking

Learning a Proposal Classifier for Multiple Object Tracking

Online Multi-Object Tracking Based on Feature Representation and Bayesian Filtering Within a Deep Learning Architecture

FAMNet: Joint Learning of Feature, Affinity and Multi-dimensional Assignment for Online Multiple Object Tracking

MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking

Deep Human-Interaction and Association by Graph-Based Learning for Multiple Object Tracking in the Wild

A New Architecture for Neural Enhanced Multiobject Tracking

Deep Association: End-to-end Graph-Based Learning for Multiple Object Tracking with Conv-Graph Neural Network

Deep learning and multi-modal fusion for real-time multi-object tracking: Algorithms, challenges, datasets, and comparative study

FACT: Feature Adaptive Continual-learning Tracker for Multiple Object Tracking

End-to-end Recurrent Multi-Object Tracking and Trajectory Prediction with Relational Reasoning

Efficient Combination Graph Model Based on Conditional Random Field for Online Multi-Object Tracking

MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors

TR-MOT: Multi-Object Tracking by Reference

Joint Multi-Object Detection and Tracking with Camera-LiDAR Fusion for Autonomous Driving

Learning of Global Objective for Network Flow in Multi-Object Tracking

Learnable Graph Matching: Incorporating Graph Partitioning with Deep Feature Learning for Multiple Object Tracking