TGAN: A Simple Model Update Strategy for Visual Tracking Via Template-Guidance Attention Network.

Kai Yang, Haijun Zhang, Dongliang Zhou, Linlin Liu
DOI: https://doi.org/10.1016/j.neunet.2021.08.010
IF: 7.8
2021-01-01
Neural Networks
Abstract: Visual attention has been widely used across visual tasks in recent years. Recently, visual trackers based on probabilistic discriminative model prediction (PrDiMP) and the Siamese box adaptive network (SiamBAN) have attracted much attention due to their excellent performance and high efficiency. However, the target template in both PrDiMP and SiamBAN is not updated online, and the feature vectors of the template image and the search image are independent of each other in the IoU-Net and Siamese frameworks. In this research, we propose a template-guidance attention network in both the IoU-Net (denoted as TGAN-I) and Siamese (denoted as TGAN-S) frameworks for visual tracking. TGAN-I and TGAN-S comprehensively utilize the feature information of the template image and the search image, and provide an implicit way to update the template. By utilizing a simple template update strategy, the TGAN-I and TGAN-S trackers can be more robust under certain challenging conditions such as occlusion and deformation. In addition, we introduce a channel and spatial attention module on the feature maps of the template image and the search image for adaptive feature refinement. Deformable convolutional networks are further used to improve the model's generalization across transformations, aspect ratios, and scales of tracking targets. To verify the effectiveness of the proposed method, we evaluate the TGAN-I and TGAN-S trackers on six benchmarks and achieve state-of-the-art results. In particular, the TGAN-I method outperforms the strong baseline, PrDiMP, improving the EAO score from 0.323 to 0.355 on VOT2019 and from 0.471 to 0.501 on VOT2016.
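The abstract mentions a channel and spatial attention module for adaptive feature refinement but does not give its design. As a rough illustration only, the sketch below assumes a CBAM-style layout (channel gating followed by spatial gating) written in NumPy. The function names, the random weights, the reduction ratio, and the 3x3 kernel are all hypothetical and are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Gate each channel of feat (C, H, W) with a weight in (0, 1).

    w1 (C//r, C) and w2 (C, C//r) are hypothetical shared-MLP weights.
    """
    avg = feat.mean(axis=(1, 2))  # (C,) average-pooled descriptor
    mx = feat.max(axis=(1, 2))    # (C,) max-pooled descriptor

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # two layers with ReLU

    weights = sigmoid(mlp(avg) + mlp(mx))    # (C,)
    return feat * weights[:, None, None]


def spatial_attention(feat, kernel):
    """Gate each spatial location of feat (C, H, W) with a weight in (0, 1).

    kernel (2, k, k), k odd, is applied to the stacked [mean, max]
    channel-pooled maps with zero padding.
    """
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = feat.shape[1:]
    logits = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            logits[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return feat * sigmoid(logits)[None, :, :]


def refine(feat, w1, w2, kernel):
    """Apply channel attention, then spatial attention (assumed order)."""
    return spatial_attention(channel_attention(feat, w1, w2), kernel)


# Toy usage with random features and weights (illustrative values only).
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 6, 6))
w1 = rng.standard_normal((2, 8)) * 0.1
w2 = rng.standard_normal((8, 2)) * 0.1
kernel = rng.standard_normal((2, 3, 3)) * 0.1
out = refine(feat, w1, w2, kernel)
```

Because both gates lie strictly between 0 and 1, the refined map keeps the input shape and never exceeds the input in magnitude. In a tracker of the kind described, the same refinement would presumably be applied to both the template and the search feature maps.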