HATFNet: Hierarchical Adaptive Trident Fusion Network for RGBT Tracking

Yanjie Zhao, Huicheng Lai, Guxue Gao
DOI: https://doi.org/10.1007/s10489-023-04755-6
IF: 5.3
2023-01-01
Applied Intelligence
Abstract: In recent years, RGBT tracking has attracted growing attention from researchers because of its potential for handling complex scenes and operating around the clock. However, existing algorithms do not fully exploit hierarchical and multimodal features, which limits tracking accuracy. This paper proposes a novel hierarchical adaptive trident fusion network (HATFNet) that leverages the complementary advantages of hierarchical and multimodal features to achieve robust RGBT tracking. Specifically, we propose a hierarchical feature aggregation structure that integrates features from different layers to exploit their complementarity. This structure first unifies the channel counts and resolutions of the different layers and then aggregates them adaptively with a hierarchical feature aggregation (HFA) module. In addition, we design a multimodal feature fusion module that fuses the features of the two modalities according to each modality's reliability, taking full advantage of the positive effect of high-quality modalities. Finally, a trident fusion structure integrates three features (the fused, RGB, and thermal infrared features) to enrich the feature representation and enhance complementary learning between modalities. Extensive experiments on two large-scale benchmark datasets show that HATFNet significantly outperforms other recent RGB-T and RGB tracking methods.
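The three stages named in the abstract (hierarchical aggregation, reliability-weighted modality fusion, and trident fusion) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the use of softmax weights, and the plain-list feature vectors are all assumptions made for illustration, and the layer scores and reliability values that a real network would learn are passed in here as fixed numbers. Features are assumed to have already been unified to the same length.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(features, weights):
    """Element-wise weighted sum of equal-length feature vectors."""
    length = len(features[0])
    return [
        sum(w * f[i] for f, w in zip(features, weights))
        for i in range(length)
    ]


def aggregate_layers(layer_features, layer_scores):
    """Hypothetical stand-in for hierarchical aggregation.

    Assumes every layer's feature vector was already brought to the
    same length; the scores would be learned in a real network.
    """
    return weighted_sum(layer_features, softmax(layer_scores))


def fuse_modalities(rgb, tir, rgb_reliability, tir_reliability):
    """Hypothetical reliability-weighted fusion of two modalities."""
    weights = softmax([rgb_reliability, tir_reliability])
    return weighted_sum([rgb, tir], weights)


def trident(fused, rgb, tir):
    """Concatenate the fused, RGB, and thermal infrared features."""
    return fused + rgb + tir


# Toy inputs: two layers per modality, two values per layer.
rgb_feat = aggregate_layers([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
tir_feat = aggregate_layers([[0.0, 0.0], [2.0, 2.0]], [0.0, 0.0])
fused_feat = fuse_modalities(rgb_feat, tir_feat, 0.0, 0.0)
combined = trident(fused_feat, rgb_feat, tir_feat)
```

With equal scores the weights are uniform, so each stage reduces to a plain average; raising one modality's reliability value shifts the fused feature toward that modality.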