Target-Aware Transformer for Satellite Video Object Tracking

Pujian Lai, Meili Zhang, Gong Cheng, Shengyang Li, Xiankai Huang, Junwei Han
DOI: https://doi.org/10.1109/tgrs.2023.3339658
IF: 8.2
2023-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Recent years have witnessed the astonishing development of the transformer-based paradigm in single object tracking (SOT) in generic videos. However, because the targets of interest in satellite videos are small in size and weak in visual appearance, the advancement of the transformer-based paradigm in satellite video object tracking has been impeded. To alleviate this issue, a novel transformer-based recipe is proposed, which consists of a bi-direction propagation and fusion (Bi-PF) strategy and a target-aware enhancement (TAE) module. Concretely, we first adopt the Bi-PF strategy to make full use of multiscale information and generate discriminative representations of tracking targets. Then, the TAE module is employed to decouple an object query into a content-aware embedding and a spatial-aware embedding, and to produce a target prototype that helps obtain a high-quality content-aware embedding. Unlike previous methods in satellite video tracking, most of which evaluate their performance on only a few videos, we conduct extensive experiments on the SatSOT dataset, which consists of 105 videos. In particular, the proposed method achieves a success score of 45.6% and a precision score of 57.6%, surpassing the baseline method by 5.0% and 9.5%, respectively. The code will be released at https://github.com/laybebe/TATrans_SVOT.
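The abstract reports a success score and a precision score without defining them. As a rough illustration only, the sketch below follows the convention common in single object tracking benchmarks: success is the area under the curve of the fraction of frames whose IoU exceeds a threshold, and precision is the fraction of frames whose center location error falls within a pixel threshold. The function names, the `(x, y, w, h)` box format, the 21 IoU thresholds, and the 5-pixel default are all assumptions made for this example; the abstract does not state the exact protocol used on SatSOT.

```python
from math import hypot


def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a, b):
    """Euclidean distance in pixels between the centers of two boxes."""
    return hypot((a[0] + a[2] / 2) - (b[0] + b[2] / 2),
                 (a[1] + a[3] / 2) - (b[1] + b[3] / 2))


def success_score(preds, gts, steps=21):
    """Mean success rate over IoU thresholds 0, 0.05, ..., 1 (assumed spacing)."""
    ious = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (steps - 1) for i in range(steps)]
    rates = [sum(v > t for v in ious) / len(ious) for t in thresholds]
    return sum(rates) / len(rates)


def precision_score(preds, gts, pixel_threshold=5.0):
    """Fraction of frames whose center error is within pixel_threshold.

    The 5-pixel default is an assumption chosen because satellite targets
    are small; it is not taken from the abstract.
    """
    errors = [center_error(p, g) for p, g in zip(preds, gts)]
    return sum(e <= pixel_threshold for e in errors) / len(errors)
```

Calling `success_score(preds, gts)` and `precision_score(preds, gts)` on per-frame predicted and ground-truth boxes yields values in [0, 1]; multiplying by 100 gives percentages comparable in form to those quoted above.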
What problem does this paper attempt to address?