Abstract: Accurate visual object tracking depends on capturing reliable correlations between the tracking target and the search region. However, the dominant Siamese-based trackers produce dense similarity maps in a single pass via a cross-correlation operation and do nothing to remedy the contamination caused by erroneous or ambiguous matches. In this paper, we propose a novel tracker, termed the neighborhood consensus constraint-based Siamese tracker (NCSiam), which uses a neighborhood consensus constraint to refine the produced correlation maps. The intuition behind our approach is that we can disambiguate nearby erroneous or ambiguous matches by analyzing a larger context of the scene that contains a unique match. Specifically, we devise a 4D convolution-based multi-level similarity refinement (MLSR) strategy. Taking the primary similarity maps obtained from cross-correlation as input, MLSR acquires reliable matches by analyzing neighborhood consensus patterns in 4D space, thus enhancing the discriminability between the tracking target and distractors. In addition, traditional Siamese-based trackers perform classification and regression directly on similarity response maps, which discard appearance and semantic information. We therefore develop an appearance affinity decoder (AAD) to take full advantage of the semantic information of the search region. To further improve performance, we design a task-specific disentanglement (TSD) module that decouples the learned representations into classification-specific and regression-specific embeddings. Extensive experiments are conducted on six challenging benchmarks: GOT-10k, TrackingNet, LaSOT, UAV123, OTB2015, and VOT2020. The results demonstrate the effectiveness of our method. The code will be available at https://github.com/laybebe/NCSiam.
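The neighborhood consensus idea in the abstract can be illustrated with a toy sketch. This is not the authors' MLSR implementation: it assumes cosine similarity for the cross-correlation step and replaces the learned 4D convolution with a fixed 4D box (mean) filter. The function names (`correlation_4d`, `neighborhood_consensus`) and the toy feature maps are hypothetical and chosen only for illustration.

```python
import math
from itertools import product


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def correlation_4d(template, search):
    """Dense 4D similarity: corr[(i, j, k, l)] compares template cell (i, j)
    with search cell (k, l). Feature maps are nested lists [row][col][channel]."""
    corr = {}
    for i, row_t in enumerate(template):
        for j, f_t in enumerate(row_t):
            for k, row_s in enumerate(search):
                for l, f_s in enumerate(row_s):
                    corr[(i, j, k, l)] = cosine(f_t, f_s)
    return corr


def neighborhood_consensus(corr, radius=1):
    """Replace each score with the mean over its 4D neighbourhood.
    Assumption: a fixed box filter stands in for learned 4D convolution weights."""
    offsets = list(product(range(-radius, radius + 1), repeat=4))
    refined = {}
    for (i, j, k, l) in corr:
        neighbours = ((i + a, j + b, k + c, l + d) for a, b, c, d in offsets)
        vals = [corr[n] for n in neighbours if n in corr]
        refined[(i, j, k, l)] = sum(vals) / len(vals)
    return refined


# Toy data: a 2x2 template with four distinct one-hot features.
template = [[[1, 0, 0, 0], [0, 1, 0, 0]],
            [[0, 0, 1, 0], [0, 0, 0, 1]]]

# 5x5 search region: the full target sits at rows 2-3, cols 2-3;
# an isolated distractor copies only template cell (0, 0) at search cell (0, 0).
search = [[[0, 0, 0, 0] for _ in range(5)] for _ in range(5)]
for i in range(2):
    for j in range(2):
        search[2 + i][2 + j] = template[i][j]
search[0][0] = template[0][0]

raw = correlation_4d(template, search)
refined = neighborhood_consensus(raw)

true_match, distractor = (0, 0, 2, 2), (0, 0, 0, 0)
print(raw[true_match], raw[distractor])
print(refined[true_match], refined[distractor])
```

In this toy setting the raw similarity cannot separate the true match from the distractor, since both score identically. After the consensus step the true match scores higher, because its neighbouring template cells also find consistent matches nearby, whereas the distractor has no such support.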

Discriminative and Robust Online Learning for Siamese Visual Tracking

Siamese Refine Polar Mask Prediction Network for Visual Tracking

Visual Tracking Jointly with Online and Offline Learning

Learning Motion-Perceive Siamese network for robust visual object tracking

Siamese Residual Network for Efficient Visual Tracking

Online Background Discriminative Learning for Satellite Video Object Tracking

Siamese-Based Attention Learning Networks for Robust Visual Object Tracking

Distractor-aware Siamese Networks for Visual Object Tracking

Residual Attention SiameseRPN for Visual Tracking

Deformable Siamese Attention Networks for Visual Object Tracking

Learning Localization-aware Target Confidence for Siamese Visual Tracking

Learning Deep Lucas-Kanade Siamese Network for Visual Tracking

SiamATL: Online Update of Siamese Tracking Network via Attentional Transfer Learning

Unsupervised Learning of Accurate Siamese Tracking

Real-time object tracking in the wild with Siamese network

Hybrid Online Visual Tracking of Non-rigid Objects

Adaptive Siamese Tracking with a Compact Latent Network

R-SiamNet: ROI-Align Pooling Based Siamese Network for Object Tracking

NCSiam: Reliable Matching Via Neighborhood Consensus for Siamese-Based Object Tracking

Learning Temporal-Correlated and Channel-Decorrelated Siamese Networks for Visual Tracking

SiamRCR: Reciprocal Classification and Regression for Visual Object Tracking