FAML-RT: Feature Alignment-Based Multi-Level Similarity Metric Learning Network for a Two-Stage Robust Tracker

Jiahao Nie, Zhekang Dong, Zhiwei He, Han Wu, Mingyu Gao
DOI: https://doi.org/10.1016/j.ins.2023.02.083
IF: 8.1
2023-01-01
Information Sciences
Abstract: Existing multi-stage trackers treat visual object tracking as a process of repeated feature extraction and similarity measurement. However, the similarity metrics they use are typically based on linear cross-correlation, which ignores the matching of detailed information. Moreover, the feature extraction operators (e.g., RoI align) lead to a sub-optimal feature representation for matching. In this paper, we propose a novel similarity metric method, called the feature alignment-based multi-level similarity metric learning network, to address these issues. Technically, we develop a feature alignment module to extract the features while suppressing the useless background information that affects the matching. Subsequently, using the aligned features, we design a learnable multi-level similarity metric learning network that matches detailed information at the channel and spatial levels, which effectively guides an accurate and discriminative similarity score. By integrating the above components as the second stage, a two-stage robust tracking method, FAML-RT, is presented. Extensive experiments on the challenging benchmarks OTB100, LaSOT and VOT2018 show that FAML-RT achieves competitive performance against state-of-the-art methods while running at a high speed of 60 fps. Furthermore, a series of ablation studies demonstrates the effectiveness of the proposed feature alignment-based multi-level metric learning network.
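For context, the linear cross-correlation baseline that the abstract contrasts against can be sketched as follows. This is a generic, dependency-free illustration of sliding-window correlation between a template feature and a search-region feature. It is not the FAML-RT network, and the function name `cross_correlation` and the toy feature values are hypothetical, chosen only for this example.

```python
def cross_correlation(template, search):
    """Slide a C x h x w template over a C x H x W search feature map.

    Returns an (H-h+1) x (W-w+1) response map in which each entry is the
    inner product between the template and the search window at that offset.
    Features are plain nested lists (an assumption made for readability).
    """
    C = len(template)
    h, w = len(template[0]), len(template[0][0])
    H, W = len(search[0]), len(search[0][0])
    response = []
    for i in range(H - h + 1):
        row = []
        for j in range(W - w + 1):
            score = 0.0
            # One scalar per window: sum over all channels and positions.
            for c in range(C):
                for u in range(h):
                    for v in range(w):
                        score += template[c][u][v] * search[c][i + u][j + v]
            row.append(score)
        response.append(row)
    return response


# Toy example (hypothetical values): one channel, 2x2 template, 3x3 search region.
template = [[[1.0, 0.0],
             [0.0, 1.0]]]
search = [[[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0],
           [7.0, 8.0, 9.0]]]
response = cross_correlation(template, search)
```

Because each window is collapsed into a single sum over all channels and positions, the resulting score does not indicate which channels or spatial locations actually matched. This appears to be the kind of lost detail that the channel-level and spatial-level matching described in the abstract is meant to recover.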