M5L: Multi-Modal Multi-Margin Metric Learning for RGBT Tracking.

Zhengzheng Tu, Chun Lin, Wei Zhao, Chenglong Li, Jin Tang
DOI: https://doi.org/10.1109/tip.2021.3125504
IF: 10.6
2022-01-01
IEEE Transactions on Image Processing
Abstract: Classifying hard samples during RGBT tracking is a challenging problem. Existing methods focus only on enlarging the boundary between positive and negative samples and ignore the relations among multilevel hard samples, which are crucial for robust hard-sample classification. To handle this problem, we propose a novel Multi-Modal Multi-Margin Metric Learning framework, named M5L, for RGBT tracking. In particular, we divide all samples into four parts (normal positive, normal negative, hard positive, and hard negative) and leverage their relations to improve the robustness of feature embeddings; e.g., normal positive samples are closer to the ground truth than hard positive ones. To this end, we design a multi-modal multi-margin structural loss to preserve the relations of multilevel hard samples in the training stage. In addition, we introduce an attention-based fusion module to achieve quality-aware integration of different source data. Extensive experiments on large-scale datasets show that our framework clearly improves tracking performance and performs favorably against state-of-the-art RGBT trackers.
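The abstract does not give the form of the multi-margin structural loss, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes that each sample group should lie at a larger mean distance from the ground-truth embedding than the previous one, in the order normal positive, hard positive, hard negative, normal negative. Only the first of these relations is stated in the abstract; the rest of the ordering, the function name `multi_margin_loss`, the group keys, and the margin values are all hypothetical.

```python
import numpy as np

def multi_margin_loss(anchor, groups, margins=(0.1, 0.5, 0.1)):
    """Sum of hinge penalties on adjacent sample groups (illustrative only).

    anchor  : 1-D array, embedding of the ground truth.
    groups  : dict of 2-D arrays (n_samples, dim), one per sample type.
    margins : assumed margins between adjacent groups (hypothetical values).
    """
    # Assumed ordering from closest to farthest from the ground truth.
    order = ("normal_pos", "hard_pos", "hard_neg", "normal_neg")
    # Mean Euclidean distance from each group to the anchor embedding.
    d = [np.linalg.norm(groups[k] - anchor, axis=1).mean() for k in order]
    # A pair is penalised when the nearer group is not at least
    # `m` closer to the anchor than the next group.
    return sum(max(0.0, d[i] - d[i + 1] + m) for i, m in enumerate(margins))

anchor = np.zeros(2)
well_ordered = {
    "normal_pos": np.array([[1.0, 0.0]]),
    "hard_pos": np.array([[2.0, 0.0]]),
    "hard_neg": np.array([[3.0, 0.0]]),
    "normal_neg": np.array([[4.0, 0.0]]),
}
# Same points, but assigned to the groups in reverse order.
reversed_order = {
    "normal_pos": np.array([[4.0, 0.0]]),
    "hard_pos": np.array([[3.0, 0.0]]),
    "hard_neg": np.array([[2.0, 0.0]]),
    "normal_neg": np.array([[1.0, 0.0]]),
}
loss_ok = multi_margin_loss(anchor, well_ordered)
loss_bad = multi_margin_loss(anchor, reversed_order)
print(loss_ok, loss_bad)
```

With the well-ordered groups every hinge term is inactive, while the reversed assignment is penalised on all three pairs. Using a separate margin per pair is what distinguishes this kind of loss from a single positive/negative margin.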