Scale Space Tracker with Multiple Features

Jining Bao, Yunzhou Zhang, Shangdong Zhu
DOI: https://doi.org/10.1007/s11042-022-13449-z
IF: 2.577
2022-01-01
Multimedia Tools and Applications
Abstract: Object tracking in video has been an active research topic for decades, and many approaches have been proposed to improve visual tracking, a challenging task in computer vision. Among state-of-the-art methods, correlation filters have achieved particularly strong performance in visual object tracking. However, they lack flexibility in robust scale estimation. In this paper, we improve tracking performance with high discriminative power and explore an energy-efficient approach to designing a simple yet superior tracker. First, instead of extracting a single simple feature, we use multi-feature channels from the color space and from convolutional layers, and establish a corresponding weighted formulation to fuse the multiple features. Optimizing this formulation yields the latest position estimate of the target object. Furthermore, a scale space correlation filter is investigated within the tracking-by-detection framework to estimate the scale variation of the target object from the updated position estimate. Additionally, we employ a fusion approach to merge the multi-channel response maps into an optimal tracking result, which ensures that our model supplies sufficient tracking information. Compared with existing tracking approaches, we reduce the computational complexity. On the OTB dataset, our tracker significantly improves on the baseline, with a gain of 3.4% in the experimental evaluation. Both quantitative and qualitative evaluations on multiple benchmark sequences demonstrate that our proposed algorithm outperforms state-of-the-art approaches.
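The abstract does not give the weighted formulation itself, so the following is only a minimal sketch of the general idea it describes: response maps from several feature channels are combined by a weighted sum, the peak of the fused map gives the position estimate, and a small set of scale factors around the current size is then evaluated. The function names, the fixed weights, and the scale-pyramid parameters (`num_scales`, `step`) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def fuse_responses(responses, weights):
    """Weighted sum of per-feature response maps (hypothetical fusion rule).

    responses: list of 2-D arrays with identical shape, one per feature channel.
    weights:   one non-negative weight per response map; normalized to sum to 1.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    stacked = np.stack(responses, axis=0)      # shape (n_features, H, W)
    return np.tensordot(w, stacked, axes=1)    # shape (H, W)

def peak_location(response):
    """(row, col) index of the maximum of a 2-D response map."""
    return np.unravel_index(np.argmax(response), response.shape)

def scale_factors(num_scales=5, step=1.02):
    """Geometric set of candidate scales centred on 1.0 (assumed parameters)."""
    exponents = np.arange(num_scales) - (num_scales - 1) / 2
    return step ** exponents

# Toy usage: a color-feature map and a convolutional-feature map that disagree.
color_resp = np.zeros((3, 3))
color_resp[0, 0] = 1.0
conv_resp = np.zeros((3, 3))
conv_resp[2, 1] = 1.0

fused = fuse_responses([color_resp, conv_resp], weights=[0.25, 0.75])
row, col = peak_location(fused)                # position estimate from the fused map
candidates = scale_factors()                   # scales to test at that position
```

In this toy case the higher-weighted map decides the fused peak. A tracker would typically derive the weights from response quality or from an optimization rather than fixing them by hand, and would score each candidate scale with a separate one-dimensional scale filter.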