Learning Fine-Grained Similarity Matching Networks for Visual Tracking

Dawei Zhang, Zhonglong Zheng, Xiaowei He, Liu Su, Liyuan Chen
DOI: https://doi.org/10.1145/3372278.3390729
2020-01-01
Abstract: Recently, Siamese trackers have become increasingly popular in the visual tracking community. Despite their great success, robust tracking in various challenging scenarios remains difficult. In this paper, we propose a novel similarity matching network that effectively extracts fine-grained semantic features by adding a Classification branch and a Category-Aware module to the classical Siamese framework (CCASiam). More specifically, the supervision module fully utilizes class information to obtain a classification loss, while the whole network is trained with a tracking loss, so that the network can extract more discriminative features for each specific target. During online tracking, the classification branch is removed and the category-aware module guides the selection of target-active features using a ridge regression network, which avoids unnecessary computation and over-fitting. Furthermore, we introduce different types of attention mechanisms to selectively emphasize important semantic information. Owing to its fine-grained and category-aware features, CCASiam achieves high-performance tracking efficiently. Extensive experimental results on several tracking benchmarks show that the proposed tracker achieves state-of-the-art performance at real-time speed.
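The abstract does not spell out how the ridge regression network selects target-active features, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each feature channel is scored independently by a closed-form scalar ridge regression against a Gaussian-shaped label centered on the target, and that the channels with the lowest fitting error are kept. The function names (`gaussian_label`, `ridge_weight`, `select_channels`) and the parameters `sigma`, `lam`, and `k` are hypothetical.

```python
import math


def gaussian_label(h, w, sigma=1.0):
    """Flattened h*w Gaussian response map peaked at the patch center (assumed label shape)."""
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    return [
        math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
        for y in range(h)
        for x in range(w)
    ]


def ridge_weight(x, y, lam=0.1):
    """Closed-form scalar ridge regression: argmin_w ||w*x - y||^2 + lam*w^2."""
    xy = sum(a * b for a, b in zip(x, y))
    xx = sum(a * a for a in x)
    return xy / (xx + lam)


def select_channels(features, label, k, lam=0.1):
    """Return the indices (ascending) of the k channels whose ridge fit to the label has the lowest error.

    `features` is a list of flattened channels, each the same length as `label`.
    Scoring channels independently is a simplifying assumption of this sketch.
    """
    scored = []
    for idx, chan in enumerate(features):
        w = ridge_weight(chan, label, lam)
        err = sum((w * a - b) ** 2 for a, b in zip(chan, label))
        scored.append((err, idx))
    scored.sort()
    return sorted(idx for _, idx in scored[:k])
```

In this sketch, a channel whose activation already resembles the target-centered label gets a low error and is retained, while flat or inverted channels are dropped, which illustrates how pruning inactive channels could reduce computation during online tracking.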