Visual Tracking Combining Attention and Feature Fusion Network Modulation

Xu Keying, Shu Ping, Bao Hua
DOI: https://doi.org/10.3788/lop202259.1210013
2022-01-01
Laser & Optoelectronics Progress
Abstract: Existing network-modulation tracking algorithms ignore high-order feature information, so they are prone to drift under large scale changes and object deformation. This paper proposes an object tracking algorithm that combines an attention mechanism with feature fusion network modulation. First, an efficient selective kernel attention module is embedded in the feature extraction backbone so that the network focuses on extracting target feature information. Second, a multiscale interactive network is applied to the extracted features to fully mine the multiscale information within each layer, and high-order feature information is fused to strengthen the target representation and adapt to the complex, changing conditions encountered during tracking. Finally, a pyramid modulation network guides the test branch to learn the optimal intersection over union (IoU) prediction, yielding an accurate estimate of the target. Experimental results show that the proposed algorithm achieves more competitive tracking accuracy and success rate than other algorithms on the VOT2018, OTB100, GOT10k, TrackingNet, and LaSOT visual tracking benchmarks.
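For readers unfamiliar with the quantity the test branch learns to predict, the following is a minimal sketch of the standard intersection over union between two axis-aligned bounding boxes. It is not the paper's network or code; the function name `iou` and the `(x1, y1, x2, y2)` corner format are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    Illustrative helper only; the corner format is an assumption,
    not something specified in the abstract.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
    area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate boxes with zero total area.
    return inter / union if union > 0 else 0.0
```

A tracker that predicts IoU scores candidate boxes by how closely this value is expected to approach 1 against the true target box, then refines the candidate with the highest predicted score.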