Abstract:<p>Recent advanced trackers, which combine a discriminative classification component with a dedicated bounding box estimation component, have achieved improved performance in the visual tracking community. The most essential factor in this development is the use of different Convolutional Neural Networks (CNNs), which significantly improves model capacity through offline-trained deep feature representations. Although powerful deep structures emphasize semantic appearance through high-dimensional latent variables, how to achieve effective feature adaptation in the online tracking stage has not yet been sufficiently considered. To this end, we argue that hierarchical and complementary appearance descriptors from different convolutional layers need to be explored to achieve online tracking adaptation. In this paper, we therefore propose an adaptive feature fusion mechanism that balances the detection granularities from shallow to deep convolutional layers. Specifically, the correlation between template and instance is used to generate adaptive weights that improve saliency and discrimination. In addition, to account for temporal appearance variation, the projection matrix for the multi-channel inputs is updated jointly with the correlation classifier to further enhance robustness. Experimental results on four recent benchmarks, <em>i.e.</em>, OTB-2015, VOT2018, LaSOT and TrackingNet, demonstrate the effectiveness and robustness of the proposed method, which achieves superior performance compared to state-of-the-art approaches.</p>
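
The idea of deriving fusion weights from template-instance correlation can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`cross_correlate`, `adaptive_fuse`), the use of the standardized response peak as a confidence score, the softmax weighting, and the `temperature` parameter are all assumptions chosen for illustration. The sketch also assumes that the feature maps of every layer have already been resized to a common spatial grid.

```python
import numpy as np

def cross_correlate(template, instance):
    """Circular cross-correlation of two (C, H, W) feature maps,
    computed with the FFT and summed over channels."""
    t_hat = np.fft.fft2(template, axes=(-2, -1))
    x_hat = np.fft.fft2(instance, axes=(-2, -1))
    resp = np.fft.ifft2(np.conj(t_hat) * x_hat, axes=(-2, -1)).real
    return resp.sum(axis=0)

def adaptive_fuse(templates, instances, temperature=1.0):
    """Fuse per-layer response maps with correlation-derived weights.

    templates, instances: lists with one (C_l, H, W) array per layer.
    Returns the fused (H, W) response map and the per-layer weights.
    """
    responses = []
    for t, x in zip(templates, instances):
        r = cross_correlate(t, x)
        # Standardize so that layers with different channel counts
        # and magnitudes become comparable (an assumed choice).
        r = (r - r.mean()) / (r.std() + 1e-8)
        responses.append(r)

    # Assumed confidence score: height of the standardized peak.
    scores = np.array([r.max() for r in responses])

    # Numerically stable softmax over layers gives the fusion weights.
    w = np.exp((scores - scores.max()) / temperature)
    w /= w.sum()

    fused = sum(wi * ri for wi, ri in zip(w, responses))
    return fused, w
```

In this sketch, a layer whose response map has a sharp, unambiguous peak receives a larger weight than a layer whose response is flat or noisy, so the fused map leans on whichever level of the hierarchy is currently most discriminative.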

Visual Object Tracking Based on Mutual Learning Between Cohort Multiscale Feature-Fusion Networks with Weighted Loss

Online Scale Adaptive Visual Tracking Based on Multilayer Convolutional Features

Learning Fully Convolutional Network for Visual Tracking with Multi-Layer Feature Fusion

Visual Tracking Based on Multi-Feature Fusion

Deep Mutual Learning for Visual Tracking

Deep mutual learning for visual object tracking

Research on Target Tracking Algorithm Based on Multi-Layer Convolution Feature Fusion

Adaptive feature fusion for visual object tracking

Convolutional Neural Networks Based Scale-Adaptive Kernelized Correlation Filter For Robust Visual Object Tracking

Robust Visual Tracking with Deep Feature Fusion

Siamese Network Based Features Fusion For Adaptive Visual Tracking

Exploiting multi-scale hierarchical feature representation for visual tracking

Real-time tracking based on deep feature fusion

Fast Multi-Object Tracking Using Convolutional Neural Networks with Tracklets Updating

Lightweight Deep Neural Network for Real-Time Visual Tracking with Mutual Learning

Multiple Object Tracking with Adaptive Multi-Features Fusion and Improved Learnable Graph Matching

One-shot Multi-Object Tracking Using CNN-based Networks with Spatial-Channel Attention Mechanism

Mutual Learning and Feature Fusion Siamese Networks for Visual Object Tracking

An Improved C-COT Based Visual Tracking Scheme to Weighted Fusion of Diverse Features

Learning Compact Target-Oriented Feature Representations for Visual Tracking

Learning a Dynamic Feature Fusion Tracker for Object Tracking