FDTrack: A Dual-head Focus Tracking Network with Frequency Enhancement

Zhao Gao, Dongming Zhou, Jinde Cao, Yisong Liu, Qingqing Shan
DOI: https://doi.org/10.1109/jsen.2024.3506929
IF: 4.3
2024-01-01
IEEE Sensors Journal
Abstract: RGB-T tracking combines the advantages of visible and thermal sensors to achieve accurate target tracking in complex scenarios. However, previous RGB-T trackers based on the self-attention mechanism overlook crucial high-frequency information (such as texture, edges, and colors) that is essential for object prediction. To address this limitation, we propose a frequency-enhanced dual-head focus tracking network (FDTrack) for RGB-T tracking. FDTrack comprises four main components: high-frequency feature enhancement (HFFE), wavelet multi-frequency interaction (WMF), autonomous modality prediction (AMP), and search focus preprocessing (SFP). HFFE refines the features from the ViT backbones within each modality by adaptively amplifying high-frequency features. WMF, by contrast, works across modalities: it facilitates communication between different frequency bands to enhance the interaction of RGB-T features from a frequency perspective. To improve tracking robustness in extreme scenes, AMP incorporates dual prediction heads and determines the final outcome through feature matching. SFP adjusts the convolution kernel size based on pixel-to-target distance and preprocesses the search region with Gaussian blur to reduce background clutter interference and emphasize the target. Extensive experimental results demonstrate that FDTrack achieves competitive performance compared to state-of-the-art algorithms on the RGBT210, RGBT234, and LasHeR datasets.
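The SFP idea described in the abstract (a Gaussian blur whose kernel size depends on each pixel's distance to the target) can be illustrated with a small sketch. The abstract does not give the actual rule, so everything here is an assumption made for illustration: the function names `focus_blur` and `kernel_size_for_distance`, the parameters `step` and `max_size`, the linear growth of kernel size with distance, and the choice `sigma = size / 3` are all hypothetical and are not taken from the paper.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a normalized size x size Gaussian kernel (size must be odd)."""
    half = size // 2
    kernel = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
               for dx in range(-half, half + 1)]
              for dy in range(-half, half + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def kernel_size_for_distance(dist, step=4.0, max_size=7):
    """Hypothetical rule: the odd kernel size grows by 2 every `step` pixels.

    `max_size` is assumed to be odd and caps the amount of blur.
    """
    size = 1 + 2 * int(dist // step)
    return min(size, max_size)


def focus_blur(image, target_center, step=4.0, max_size=7):
    """Blur a 2D grayscale image more strongly the farther a pixel lies
    from `target_center` (row, col). Pixels near the target stay sharp."""
    h, w = len(image), len(image[0])
    cy, cx = target_center
    out = [[0.0] * w for _ in range(h)]
    kernels = {}  # cache one kernel per size
    for y in range(h):
        for x in range(w):
            dist = math.hypot(y - cy, x - cx)
            size = kernel_size_for_distance(dist, step, max_size)
            if size == 1:
                # Closest ring around the target: keep the pixel unchanged.
                out[y][x] = float(image[y][x])
                continue
            if size not in kernels:
                kernels[size] = gaussian_kernel(size, sigma=size / 3.0)
            k = kernels[size]
            half = size // 2
            acc = 0.0
            for dy in range(-half, half + 1):
                yy = min(max(y + dy, 0), h - 1)  # clamp at image borders
                for dx in range(-half, half + 1):
                    xx = min(max(x + dx, 0), w - 1)
                    acc += k[dy + half][dx + half] * image[yy][xx]
            out[y][x] = acc
    return out


# Usage: a checkerboard stands in for a cluttered search region.
search = [[float((x + y) % 2) for x in range(16)] for y in range(16)]
focused = focus_blur(search, target_center=(8, 8))
```

In this sketch the checkerboard pattern survives untouched around the assumed target center and is progressively smoothed toward the borders, which mirrors the stated goal of suppressing background clutter while emphasizing the target. A real tracker would presumably apply such a step to image tensors on the GPU rather than to nested Python lists.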