Abstract: Siamese trackers learn an appearance model of the target in the first frame and then use that model to locate the target in subsequent frames, while the appearance model itself remains unchanged. Owing to the powerful feature extraction capability of deep convolutional neural networks, Siamese trackers achieve strong performance. However, because the appearance model is never updated while the target's appearance changes, tracking drift occurs frequently, especially in scenes with background clutter. To tackle this issue, we propose a motion model and a discriminative model. First, a motion model of the target is constructed to determine whether tracking drift has occurred: the position predicted by the motion model is temporally smooth, whereas the position predicted by the Siamese tracker may not be. In this way, temporal information supplements the Siamese tracker, which uses only spatial information. Second, a discriminative model is learned to determine the final position of the target when tracking drift happens. Finally, a flexible update strategy for the discriminative model is presented. To demonstrate the generality of the proposed method, we apply it to two well-known Siamese trackers, SiamFC and SiamRPN_DW. Extensive experiments on the OTB2013, OTB2015, VOT2016, VOT2019 and GOT-10k benchmarks demonstrate that the proposed trackers outperform the baseline trackers and achieve state-of-the-art performance, especially in scenes with background clutter. To the best of our knowledge, this is the first work to propose motion guided Siamese trackers. Moreover, we can release our code to encourage more research in this direction.
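
The drift check described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the motion model or the decision rule, so everything here is an assumption made for illustration: a constant-velocity extrapolation stands in for the motion model, drift is flagged when the Siamese prediction lies farther than a pixel threshold from the motion prediction, and the names `predict_constant_velocity`, `drift_detected` and `threshold` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) centre of the target in pixels


def predict_constant_velocity(history: List[Point]) -> Point:
    """Extrapolate the next centre from the last two centres.

    Assumption: a constant-velocity model stands in for the paper's
    motion model, which the abstract does not describe in detail.
    """
    if len(history) < 2:
        # With a single observation there is no velocity estimate yet.
        return history[-1]
    (x1, y1), (x2, y2) = history[-2], history[-1]
    return (2 * x2 - x1, 2 * y2 - y1)


def drift_detected(history: List[Point], siamese_pos: Point,
                   threshold: float = 20.0) -> bool:
    """Flag drift when the Siamese prediction is not temporally smooth.

    Assumption: "not smooth" is approximated as a Euclidean distance
    above `threshold` pixels from the motion-model prediction.
    """
    px, py = predict_constant_velocity(history)
    distance = math.hypot(siamese_pos[0] - px, siamese_pos[1] - py)
    return distance > threshold


# Usage: the target moved from (100, 100) to (104, 102) in the last two frames.
history = [(100.0, 100.0), (104.0, 102.0)]
smooth = drift_detected(history, (109.0, 105.0))    # close to the extrapolation
jumped = drift_detected(history, (160.0, 104.0))    # far from the extrapolation
```

In a pipeline following the abstract, a positive drift flag would hand the final localization to the discriminative model; that step is omitted here because the abstract gives no detail about it.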

Progressive Unsupervised Learning for Visual Object Tracking

Unsupervised Deep Representation Learning for Real-Time Tracking

Unsupervised Learning of Accurate Siamese Tracking

Unsupervised Deep Tracking

Learning to Track Objects from Unlabeled Videos

Online Object Tracking Based on CNN with Spatial-Temporal Saliency Guided Sampling

Online visual tracking via background-aware Siamese networks

Long-Term Visual Object Tracking Via Continual Learning

Discriminative and Robust Online Learning for Siamese Visual Tracking

Learning Dynamic Siamese Network For Visual Object Tracking

Progressive Multi-Stage Learning for Discriminative Tracking

Learning a Visual Tracker from a Single Movie Without Annotation

Motion Guided Siamese Trackers for Visual Tracking

Online Background Discriminative Learning for Satellite Video Object Tracking

Supplementary Material Progressive Unsupervised Learning for Visual Object Tracking

Progressive Perception Learning for Distribution Modulation in Siamese Tracking

Visual Tracking with Semi-Supervised Online Weighted Multiple Instance Learning

Single Online Visual Object Tracking with Enhanced Tracking and Detection Learning

Siamese Residual Network for Efficient Visual Tracking

Visual Tracking Jointly with Online and Offline Learning

Robust Tracking Via Fully Exploring Background Prior Knowledge