Learning Spatial-Channel Attention for Visual Tracking

Yingsen Zeng, Haiying Wang, Ting Lu
DOI: https://doi.org/10.1109/ICCChina.2019.8855908
2019-01-01
Abstract: Convolutional neural networks offer strong representational power and have been widely applied to visual tracking. However, simply deepening the network to boost performance is inappropriate for tracking because of its speed requirement. We leverage spatial attention and channel attention to enhance object features without much extra computational cost. The fused spatial-channel attention enables the network to extract discriminative and robust features of targets or background. Furthermore, we propose an inter-instance loss that makes our tracker aware of not only target-background classification but also instance classification across multiple domains. Extensive experiments on the Object Tracking Benchmark (OTB) show that the proposed tracker obtains an Area-Under-Curve (AUC) score of 66.8% on OTB2015, outperforming most state-of-the-art trackers.
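
As a rough illustration of how spatial and channel attention can reweight a feature map at low cost, the sketch below uses a parameter-free simplification: channel weights come from global average pooling, spatial weights from the mean across channels, and the two are fused by multiplication. The abstract does not specify the paper's actual module design, learned layers, or fusion rule, so all of these choices, along with the function names, are assumptions made for illustration only.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def channel_weights(x):
    """One weight per channel, from global average pooling (assumed design)."""
    return [sigmoid(sum(sum(row) for row in ch) / (len(ch) * len(ch[0])))
            for ch in x]

def spatial_weights(x):
    """One weight per location, from the mean across channels (assumed design)."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    return [[sigmoid(sum(x[k][i][j] for k in range(c)) / c) for j in range(w)]
            for i in range(h)]

def spatial_channel_attention(x):
    """Fuse both attentions by multiplication; x is a [C][H][W] nested list."""
    cw = channel_weights(x)
    sw = spatial_weights(x)
    c, h, w = len(x), len(x[0]), len(x[0][0])
    return [[[x[k][i][j] * cw[k] * sw[i][j] for j in range(w)]
             for i in range(h)]
            for k in range(c)]

# Toy feature map: 2 channels, 2x2 spatial grid.
features = [[[0.0, 0.0], [0.0, 0.0]],
            [[2.0, 2.0], [2.0, 2.0]]]
attended = spatial_channel_attention(features)
```

The output keeps the input shape, and the extra work is a handful of averages and element-wise products, which is the kind of overhead the abstract contrasts with deepening the network. A trained tracker would replace the fixed pooling-plus-sigmoid weights with learned layers.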