Multi-modal multi-task feature fusion for RGBT tracking

Yujue Cai, Xiubao Sui, Guohua Gu
DOI: https://doi.org/10.1016/j.inffus.2023.101816
IF: 18.6
2023-05-06
Information Fusion
Abstract: RGBT tracking has received increasing attention in recent years. In this paper, we propose a multi-task auxiliary learning framework for RGBT tracking. Specifically, we simplify the tracking task to an instance classification task and make it the primary task of the framework. We design three auxiliary tasks and train all tasks jointly with hard parameter sharing, so that the primary task can benefit from them. The three auxiliary tasks are contrastive instance discrimination, one-shot instance segmentation, and instance semantic matching. Contrastive instance discrimination aids the classification process of the primary task by constraining the features in the representation space. One-shot instance segmentation trains the network in a weakly supervised way to focus on more fine-grained features. In addition, to make the network pay more attention to the invariant features of the target instance during tracking, we introduce a semantic matching task that alleviates the model drift caused by changes over time. On three RGBT tracking benchmarks, the proposed framework is competitive with state-of-the-art trackers.
computer science, artificial intelligence, theory & methods
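
The hard parameter sharing setup described in the abstract (one shared feature extractor, one output head per task, a single joint loss) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no architectural details, so the layer sizes, the single-output heads, the random initialisation, and the `aux_weight` parameter are all assumptions made for illustration. Only the four task names come from the abstract.

```python
import random


def init_weights(rows, cols, rng):
    """Small random weight matrix (hypothetical initialisation)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Dense layer without bias: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class SharedMultiTaskModel:
    """Hard parameter sharing: every task reads the same encoder output."""

    # Primary task first, then the three auxiliary tasks named in the abstract.
    TASKS = ("classification", "contrastive", "segmentation", "matching")

    def __init__(self, in_dim=8, hidden_dim=4, seed=0):
        rng = random.Random(seed)
        # Shared parameters, updated by the losses of all tasks.
        self.encoder = init_weights(hidden_dim, in_dim, rng)
        # Task-specific parameters, one small head per task (assumed shape).
        self.heads = {t: init_weights(1, hidden_dim, rng) for t in self.TASKS}

    def forward(self, x):
        # Shared representation with a ReLU non-linearity.
        shared = [max(0.0, h) for h in linear(self.encoder, x)]
        # Each head maps the same shared features to its own output.
        return {t: linear(head, shared)[0] for t, head in self.heads.items()}


def joint_loss(task_losses, aux_weight=0.5):
    """Primary loss plus a weighted sum of auxiliary losses (assumed weighting)."""
    primary = task_losses["classification"]
    auxiliary = sum(v for k, v in task_losses.items() if k != "classification")
    return primary + aux_weight * auxiliary


model = SharedMultiTaskModel()
outputs = model.forward([1.0] * 8)
total = joint_loss(
    {"classification": 1.0, "contrastive": 2.0, "segmentation": 3.0, "matching": 4.0}
)
```

Because the encoder is shared, gradients from the auxiliary losses would shape the same features the primary classification head uses, which is the mechanism by which the primary task is meant to benefit.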