Abstract: Multi-object tracking (MOT) on low-frame-rate video is a promising way to meet the computing, storage, and transmission-bandwidth constraints of edge devices. Tracking at a low frame rate poses particular challenges in the association stage, because objects in two successive frames typically exhibit much larger changes in location, velocity, appearance, and visibility than at normal frame rates. In this paper, we observe that such variations cause severe performance degradation in many existing association strategies. Although optical-flow-based methods such as CenterTrack can handle large displacements to some extent thanks to their large receptive field, their temporally local nature prevents them from giving reliable displacement estimates for objects that newly appear in the current frame (i.e., that are not visible in the previous frame). To overcome this limitation, we propose an online tracking method that extends the CenterTrack architecture with a new head, named APP, to recognize unreliable displacement estimates. Further, to capture the fine-grained, individual unreliability of each displacement estimate, we extend the binary APP predictions to displacement uncertainties. To this end, we reformulate the displacement estimation task using Bayesian deep learning tools. With the APP predictions, we conduct association in a multi-stage manner, where visual cues or historical motion cues are leveraged in the corresponding stage. By rethinking the commonly used bipartite matching algorithms, we equip the proposed multi-stage association policy with a hybrid matching strategy conditioned on displacement uncertainties. Our method is robust in preserving identities in low-frame-rate video sequences. Experimental results on public datasets in various low-frame-rate settings demonstrate the advantages of the proposed method.
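
The idea of uncertainty-gated, multi-stage association can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the stages, thresholds, or matching rule, so everything below is an assumption made for illustration. In particular, the field names (`pos`, `vel`, `disp`, `sigma`), the thresholds (`sigma_max`, `gate_flow`, `gate_motion`), and the use of a simple greedy matcher in place of the paper's hybrid matching strategy are all hypothetical. Stage 1 trusts only detections whose predicted displacement has low uncertainty; stage 2 falls back to a constant-velocity motion cue for the rest.

```python
from math import hypot


def greedy_match(cost, max_cost):
    """Pair rows and columns in ascending cost order, ignoring pairs above max_cost."""
    candidates = sorted(
        (c, i, j)
        for i, row in enumerate(cost)
        for j, c in enumerate(row)
        if c <= max_cost
    )
    used_rows, used_cols, matches = set(), set(), []
    for _, i, j in candidates:
        if i not in used_rows and j not in used_cols:
            used_rows.add(i)
            used_cols.add(j)
            matches.append((i, j))
    return matches


def associate(tracks, detections, sigma_max=1.0, gate_flow=2.0, gate_motion=4.0):
    """Two-stage association sketch (all names and thresholds are hypothetical).

    tracks:     dicts with 'pos' (last position) and 'vel' (per-frame velocity).
    detections: dicts with 'pos', 'disp' (offset pointing back to the position
                in the previous frame), and 'sigma' (displacement uncertainty).
    Returns (matches, unmatched_track_ids, unmatched_detection_ids), where
    matches holds (track_id, detection_id) pairs.
    """
    matches = []
    free_tracks = list(range(len(tracks)))
    free_dets = list(range(len(detections)))

    # Stage 1: use the predicted displacement, but only where it is reliable.
    reliable = [d for d in free_dets if detections[d]["sigma"] <= sigma_max]
    cost = [
        [
            hypot(
                detections[d]["pos"][0] + detections[d]["disp"][0] - tracks[t]["pos"][0],
                detections[d]["pos"][1] + detections[d]["disp"][1] - tracks[t]["pos"][1],
            )
            for d in reliable
        ]
        for t in free_tracks
    ]
    for i, j in greedy_match(cost, gate_flow):
        matches.append((free_tracks[i], reliable[j]))

    matched_t = {t for t, _ in matches}
    matched_d = {d for _, d in matches}
    free_tracks = [t for t in free_tracks if t not in matched_t]
    free_dets = [d for d in free_dets if d not in matched_d]

    # Stage 2: fall back to historical motion (constant-velocity prediction).
    cost = [
        [
            hypot(
                detections[d]["pos"][0] - (tracks[t]["pos"][0] + tracks[t]["vel"][0]),
                detections[d]["pos"][1] - (tracks[t]["pos"][1] + tracks[t]["vel"][1]),
            )
            for d in free_dets
        ]
        for t in free_tracks
    ]
    stage2 = [(free_tracks[i], free_dets[j]) for i, j in greedy_match(cost, gate_motion)]
    matches.extend(stage2)

    matched_t = {t for t, _ in stage2}
    matched_d = {d for _, d in stage2}
    free_tracks = [t for t in free_tracks if t not in matched_t]
    free_dets = [d for d in free_dets if d not in matched_d]
    return matches, free_tracks, free_dets
```

In this sketch, a detection left unmatched after both stages would typically start a new track, and an unmatched track would be kept alive for a few frames or terminated. A fuller version could replace the hard `sigma_max` threshold with a cost that is weighted by the uncertainty, and add an appearance-based stage for visual cues.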

Multi-object tracking via discriminative appearance modeling

Special Issue on Visual Tracking

Supplementary Material: Quasi-Dense Similarity Learning for Multiple Object Tracking

Multi-object Tracking Via MHT with Multiple Information Fusion in Surveillance Video

APPTracker Plus: Displacement Uncertainty for Occlusion Handling in Low-Frame-Rate Multiple Object Tracking

Multi-cue Based Multi-target Tracking with Boosted MHT

MAT: Motion-Aware Multi-Object Tracking

Multi-object tracking via deep feature fusion and association analysis

MSA-MOT: Multi-Stage Association for 3D Multimodality Multi-Object Tracking

Multihuman Tracking Based on a Spatial–Temporal Appearance Match

Instance Segmentation Enabled Hybrid Data Association and Discriminative Hashing for Online Multi-Object Tracking

Multiple object tracking with appearance feature prediction and similarity fusion

Object Tracking with Multi-View Support Vector Machines

Appearance Guidance Attention for Multi-Object Tracking

MM-Tracker: Visual Tracking with A Multi-Task Model Integrating Detection and Differentiating Feature Extraction

Object Tracking with Hierarchical Multiview Learning

Hierarchical data association and depth-invariant appearance model for indoor multiple objects tracking

Multitarget Tracking Using Multifeature Model with Acceleration Feature

Multi-object Tracking by Expanding Long-Tracklets

3D Multi-Object Tracking in Point Clouds Based on Prediction Confidence-Guided Data Association

MMF-Track: Multi-modal Multi-level Fusion for 3D Single Object Tracking