Adaptive feature fusion for visual object tracking
Shaochuan Zhao, Tianyang Xu, Xiao-Jun Wu, Xue-Feng Zhu
DOI: https://doi.org/10.1016/j.patcog.2020.107679
IF: 8
2021-03-01
Pattern Recognition
Abstract: Recent advanced trackers, which combine a discriminative classification component with a dedicated bounding box estimation component, have improved performance in the visual tracking community. The most essential factor in this development is the use of different Convolutional Neural Networks (CNNs), which significantly improve model capacity via offline-trained deep feature representations. Although powerful deep structures emphasize semantic appearance through high-dimensional latent variables, how to achieve effective feature adaptation in the online tracking stage has not yet been sufficiently considered. To this end, we argue that it is necessary to explore hierarchical and complementary appearance descriptors from different convolutional layers to achieve online tracking adaptation. In this paper, we therefore propose an adaptive feature fusion mechanism that balances the detection granularities from shallow to deep convolutional layers. Specifically, the correlation between template and instance is used to generate adaptive weights that improve saliency and discrimination. In addition, to account for temporal appearance variation, the projection matrix for the multi-channel inputs is updated jointly with the correlation classifier to further enhance robustness. Experimental results on four recent benchmarks, i.e., OTB-2015, VOT2018, LaSOT and TrackingNet, demonstrate the effectiveness and robustness of the proposed method, with superior performance compared to state-of-the-art approaches.
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic
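
The abstract describes using the correlation between template and instance to weight features from shallow to deep layers. The sketch below is one possible illustration of that idea, not the authors' implementation. The abstract does not specify the weighting rule, so everything here is an assumption: the function names (`cross_correlation`, `adaptive_weights`, `fuse_responses`), the peak-sharpness score, the softmax with a `temperature` parameter, and the premise that per-layer feature maps have already been resized to a common spatial size.

```python
import numpy as np


def cross_correlation(template, instance):
    """Channel-summed circular cross-correlation, computed with the FFT.

    Both inputs are feature maps of shape (C, H, W) from the same layer.
    """
    t_f = np.fft.fft2(template, axes=(-2, -1))
    x_f = np.fft.fft2(instance, axes=(-2, -1))
    resp = np.fft.ifft2(np.conj(t_f) * x_f, axes=(-2, -1)).real
    return resp.sum(axis=0)


def adaptive_weights(responses, temperature=1.0):
    """Turn per-layer response maps into fusion weights that sum to 1.

    Assumption: a layer is scored by how sharply its peak stands out,
    (max - mean) / std, and the scores are passed through a softmax.
    """
    scores = np.array(
        [(r.max() - r.mean()) / (r.std() + 1e-8) for r in responses]
    )
    z = (scores - scores.max()) / temperature  # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()


def fuse_responses(templates, instances):
    """Fuse per-layer correlation responses with adaptive weights.

    `templates` and `instances` are lists of (C_l, H, W) arrays, one per
    layer, sharing the same H and W (hypothetical preprocessing step).
    """
    responses = [cross_correlation(t, x) for t, x in zip(templates, instances)]
    weights = adaptive_weights(responses)
    # Standardize each map so layers with different magnitudes are comparable.
    normed = [(r - r.mean()) / (r.std() + 1e-8) for r in responses]
    fused = sum(w * r for w, r in zip(weights, normed))
    return fused, weights
```

In this sketch a layer whose response map has a clear, isolated peak receives a larger weight than a layer whose response is diffuse, and the location of the maximum of `fused` serves as the estimated target displacement. The joint update of the projection matrix and the correlation classifier mentioned in the abstract is not modeled here.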