Siamese transformer RGBT tracking

Futian Wang, Wenqi Wang, Lei Liu, Chenglong Li, Jing Tang
DOI: https://doi.org/10.1007/s10489-023-04741-y
IF: 5.3
2023-07-28
Applied Intelligence
Abstract: Siamese-based RGBT trackers have attracted wide attention because of their high efficiency. However, they lack an effective multimodal fusion module and information interaction between the search area and the template area, which leads to poor performance. To solve this problem, inspired by the global information modeling capability of the transformer, we construct a Siamese-based transformer RGBT tracker consisting of a single unified transformer module. Specifically, we propose a unified transformer fusion module to achieve feature extraction and global information interaction in the Siamese RGBT tracker, i.e., the interaction between the search area and the template area and the interaction between different modalities. It consists of self-attention and cross-attention, which are used for feature extraction and information interaction, respectively. In addition, to alleviate the impact of multimodal fusion on the efficiency of template update in the tracking stage, we propose a feature-level template update strategy, which effectively improves tracking efficiency. To verify the effectiveness of our tracker, we evaluate it on five benchmark datasets, GTOT, RGBT210, RGBT234, LasHeR, and VTUAV, and the results show that our tracker achieves excellent performance compared with state-of-the-art methods.
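The division of labor described in the abstract (self-attention for feature extraction, cross-attention for interaction between modalities and between search and template areas) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names (`attention`, `fuse`), the order in which the interactions are applied, the residual additions, and the omission of learned projections and multiple heads are all assumptions made for illustration.

```python
import math

def softmax(row):
    # Numerically stable softmax over one list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product attention over lists of token vectors.
    # Assumption: no learned Q/K/V projections, single head.
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def self_attention(tokens):
    # Tokens attend to themselves (feature extraction within one stream).
    return attention(tokens, tokens, tokens)

def cross_attention(target, source):
    # Target tokens query source tokens (information interaction).
    return attention(target, source, source)

def add(a, b):
    # Element-wise residual addition of two equally shaped token lists.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def fuse(template_rgb, template_tir, search_rgb, search_tir):
    # Hypothetical fusion order, chosen only for illustration.
    t_rgb, t_tir, s_rgb, s_tir = (
        self_attention(x)
        for x in (template_rgb, template_tir, search_rgb, search_tir)
    )
    # Cross-modal interaction: each search modality attends to the other.
    s_rgb_x = add(s_rgb, cross_attention(s_rgb, s_tir))
    s_tir_x = add(s_tir, cross_attention(s_tir, s_rgb))
    search = add(s_rgb_x, s_tir_x)
    # Search-template interaction: fused search tokens attend to the
    # template tokens of both modalities (token lists concatenated).
    template = t_rgb + t_tir
    return add(search, cross_attention(search, template))

# Toy inputs: 2 template tokens and 3 search tokens per modality, dimension 2.
template_rgb = [[1.0, 0.0], [0.0, 1.0]]
template_tir = [[0.5, 0.5], [1.0, 1.0]]
search_rgb = [[0.2, 0.1], [0.4, 0.3], [0.9, 0.8]]
search_tir = [[0.1, 0.2], [0.3, 0.4], [0.8, 0.9]]
fused = fuse(template_rgb, template_tir, search_rgb, search_tir)
```

The fused output keeps one vector per search token, which is the shape a downstream prediction head would consume. A feature-level template update could plausibly reuse cached template tokens instead of re-running fusion on raw frames, but the abstract does not specify that mechanism, so it is not sketched here.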
computer science, artificial intelligence