Graph Attention Network for Context-Aware Visual Tracking

Yanyan Shao, Dongyan Guo, Ying Cui, Zhenhua Wang, Liyan Zhang, Jianhua Zhang
DOI: https://doi.org/10.1109/TNNLS.2024.3442290
2024-09-25
Abstract: Siamese-network-based trackers cast general object tracking as a similarity matching task between a template and a search region. Using convolutional feature cross correlation (Xcorr) for similarity matching, a large number of Siamese trackers have been proposed and have achieved great success. However, due to the predefined size of the target feature, these trackers suffer from either retaining much background information or losing important foreground information. Moreover, the global matching between the target and the search region largely neglects the part-level structural information and the contextual information of the target. To tackle these obstacles, in this article, we propose a simple context-aware Siamese graph attention network, which establishes part-to-part correspondence between the Siamese branches with a complete bipartite graph. The object information from the template is propagated to the search region via a graph attention mechanism. With such a design, a target-aware template input can replace the pre-fixed template region and adaptively fit the size and aspect ratio variations of different objects. Building on this, we further construct a context-aware feature matching mechanism to embed both the target and the contextual information in the search region. Experiments on challenging benchmarks including GOT-10k, TrackingNet, LaSOT, VOT2020, and OTB-100 demonstrate that the proposed SiamGAT* outperforms many state-of-the-art trackers and achieves leading performance. Code is available at: https://git.io/SiamGAT.
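The part-to-part matching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `bipartite_graph_attention`, the use of a plain dot product as the correlation score, and the omission of any learned linear transforms are all assumptions made for illustration. Each template location and each search location is treated as a graph node, every search node is connected to every template node (a complete bipartite graph), and template features are aggregated into each search node with softmax attention weights.

```python
import math


def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bipartite_graph_attention(template_nodes, search_nodes):
    """Propagate template features to search nodes (hypothetical helper).

    Assumption: both node sets share the same feature dimension, and the
    edge score is a raw dot product with no learned projection.
    Returns (fused_features, attention_weights).
    """
    fused, attention = [], []
    for s in search_nodes:
        # One edge per (search node, template node) pair.
        weights = softmax([dot(s, t) for t in template_nodes])
        # Attention-weighted sum of template features.
        aggregated = [
            sum(w * t[k] for w, t in zip(weights, template_nodes))
            for k in range(len(s))
        ]
        attention.append(weights)
        # Concatenate target information with the node's own feature.
        fused.append(aggregated + list(s))
    return fused, attention


# Toy example: 2 template parts, 3 search locations, 2-D features.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
fused, attn = bipartite_graph_attention(template, search)
```

Because the attention is computed per node rather than by sliding a fixed-size kernel, the number of template nodes can vary with the target's size and aspect ratio, which is the property the abstract attributes to the target-aware template input.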
What problem does this paper attempt to address?