DETA: A Point-Based Tracker With Deformable Transformer and Task-Aligned Learning

Kai Yang, Haijun Zhang, Feng Gao, Jianyang Shi, Yanfeng Zhang, Q. M. Jonathan Wu
DOI: https://doi.org/10.1109/tmm.2022.3223213
IF: 7.3
2022-01-01
IEEE Transactions on Multimedia
Abstract: Current point-based trackers are usually implemented with two branches: a classification branch that predicts the target candidate locations and a regression branch that regresses the tracking box. This design may lead to spatial misalignment between the two tasks. Meanwhile, these trackers neglect the question of how to define positive and negative samples during training, and they overlook explicit border information for accurate box prediction. In this research, we investigate the key issues of point-based trackers and address their main limitations. First, we design a novel task-aligned component and a new loss function, named task-aligned loss, to learn the alignment of the classification and regression tasks. Second, we introduce a border alignment (BorderAlign) component in both the classification and regression branches to effectively exploit the border features of a tracking target. Third, we develop an adaptive training sample assignment (ATSA) to adaptively divide the positive and negative samples based on the statistical characteristics of the tracking object. Finally, a deformable transformer is developed to enhance the representations of search features and explore rich temporal contexts among video frames. Extensive experimental results demonstrate that the proposed tracker achieves state-of-the-art performance on six tracking benchmark datasets.
computer science, information systems, telecommunications, software engineering