Abstract: Convolutional neural network (CNN) based tracking approaches have shown favorable performance in recent benchmarks. Nonetheless, the chosen CNN features are always pre-trained on a different task, and individual components in tracking systems are learned separately, so the achieved tracking performance may be suboptimal. Besides, most of these trackers are not designed for real-time applications because of their time-consuming feature extraction and complex optimization details. In this paper, we propose an end-to-end framework to learn the convolutional features and perform the tracking process simultaneously, namely, a unified convolutional tracker (UCT). Specifically, the UCT treats both the feature extractor and the tracking process as convolution operations and trains them jointly, so that the learned CNN features are tightly coupled to the tracking process. In online tracking, an efficient updating method is proposed by introducing a peak-versus-noise ratio (PNR) criterion, and scale changes are handled efficiently by incorporating a scale branch into the network. The proposed approach results in superior tracking performance while maintaining real-time speed. The standard UCT and UCT-Lite can track generic objects at 41 FPS and 154 FPS, respectively, without further optimization. Experiments are performed on four challenging benchmark tracking datasets: OTB2013, OTB2015, VOT2014 and VOT2015, and our method achieves state-of-the-art results on these benchmarks compared with other real-time trackers.

Learning Fully Convolutional Network for Visual Tracking with Multi-Layer Feature Fusion

Visual Tracking with Fully Convolutional Networks

Siamese Network with Multi-Scale Fusion Attention for Visual Tracking

Siamese Network Based Features Fusion For Adaptive Visual Tracking

Robust Visual Tracking with Deep Feature Fusion

Fully Convolutional Siamese Fusion Networks for Object Tracking

Adaptive feature fusion for visual object tracking

Visual Tracking Based on Multi-Feature Fusion

Multi-Task Hierarchical Feature Learning for Real-Time Visual Tracking

Research on Target Tracking Algorithm Based on Multi-Layer Convolution Feature Fusion

Hierarchical Convolutional Features for Visual Tracking

Multiple Object Tracking with Adaptive Multi-Features Fusion and Improved Learnable Graph Matching

CVTrack: Combined Convolutional Neural Network and Vision Transformer Fusion Model for Visual Tracking

Visual Object Tracking Based on Mutual Learning Between Cohort Multiscale Feature-Fusion Networks with Weighted Loss

Real-time tracking based on deep feature fusion

Dual Model Learning Combined with Multiple Feature Selection for Accurate Visual Tracking

Robust Visual Tracking Via Collaborative and Reinforced Convolutional Feature Learning

Exploiting multi-scale hierarchical feature representation for visual tracking

Fast Multi-Object Tracking Using Convolutional Neural Networks with Tracklets Updating

UCT: Learning Unified Convolutional Networks for Real-time Visual Tracking

Dynamic memory network with spatial-temporal feature fusion for visual tracking