TaCoTrack: Tracking Object with Temporal Context

Zhixuan Wang, Bo Wang
DOI: https://doi.org/10.1109/ccdc58219.2023.10327428
2023-01-01
Abstract: The performance of visual object tracking generally depends on the information extracted from consecutive frames. However, existing trackers cannot leverage temporal contexts to extract enough information from frames and do not adapt well to various challenges. In this paper, we present a neat and efficient framework, TaCoTrack, which fully exploits temporal contexts for object tracking. Temporal contexts are employed in two ways: the fusion of features and the refinement of search-response features. Specifically, for feature fusion, a dynamic self-adaptive convolution, which is capable of representing spatial features, is designed to fuse the features extracted from multiple input frames with temporal information. For search-response feature refinement, we construct a temporal convolution whose weights and bias change with each input. Extensive experiments demonstrate the effectiveness and robustness of TaCoTrack.
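To make the idea of a temporal convolution "whose weights and bias change with each input" concrete, the following is a minimal NumPy sketch and not the authors' implementation. The abstract does not say how the weights are generated, so everything here is an assumption made for illustration: the function name `dynamic_temporal_conv`, the generator matrices `w_gen` and `b_gen`, the mean pooling over frames, and the softmax normalization of the kernel are all hypothetical.

```python
import numpy as np

def dynamic_temporal_conv(x, w_gen, b_gen):
    """Input-conditioned 1D convolution along the time axis (illustrative only).

    x:     (T, C) array, one C-dimensional feature vector per frame.
    w_gen: (K, C) hypothetical generator for the K kernel taps (K odd).
    b_gen: (C, C) hypothetical generator for the per-channel bias.
    """
    T, C = x.shape
    K = w_gen.shape[0]

    # Summarize the whole clip; kernel and bias are derived from this summary,
    # so they differ for every input sequence.
    context = x.mean(axis=0)                      # (C,)

    logits = w_gen @ context                      # (K,)
    kernel = np.exp(logits - logits.max())
    kernel /= kernel.sum()                        # softmax: taps sum to 1

    bias = b_gen @ context                        # (C,)

    # "Same" padding in time by repeating the first and last frames.
    pad = K // 2
    x_pad = np.pad(x, ((pad, pad), (0, 0)), mode="edge")

    # Weighted sum over a sliding window of K frames, shared across channels.
    out = np.stack([kernel @ x_pad[t:t + K] for t in range(T)])  # (T, C)
    return out + bias, kernel, bias


rng = np.random.default_rng(0)
T, C, K = 5, 4, 3
x = rng.normal(size=(T, C))
w_gen = rng.normal(size=(K, C))
b_gen = 0.1 * rng.normal(size=(C, C))

y, kernel, bias = dynamic_temporal_conv(x, w_gen, b_gen)
```

A conventional convolution would hold `kernel` and `bias` fixed after training; in this sketch only the generators `w_gen` and `b_gen` would be learned, and the effective filter is recomputed per input.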