A Deep Temporal-Spectral-Spatial Anchor-Free Siamese Tracking Network for Hyperspectral Video Object Tracking
Zhenqi Liu, Yanfei Zhong, Guorui Ma, Xinyu Wang, Liangpei Zhang
DOI: https://doi.org/10.1109/tgrs.2024.3483072
IF: 8.2
2024-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Hyperspectral video provides high spatial, high spectral, and high temporal (H³) information about objects of interest, which makes it possible to track objects in complex scenarios. However, as a target moves, changes in its appearance, background, and spectral information can degrade the performance of existing hyperspectral trackers, which generalize poorly because of insufficient training data. To address these problems, this paper proposes HA-Net, a deep temporal-spectral-spatial anchor-free Siamese tracking network for hyperspectral video object tracking. In HA-Net, a Siamese spectral enhancement tracker module based on an RGB tracker (pseudo-color tracker) uses the feature expression capability of the deep network to learn more discriminative deep spectral features for identifying objects in complex scenarios. The pseudo-color tracker is introduced to overcome the performance limitation caused by insufficient training data. A temporal-spectral-spatial online discrimination learning module dynamically models the temporal-spectral-spatial information of the target, so that the tracker adapts to new targets and to dynamic changes of targets. Benefiting from the double Siamese network architecture, the model can be trained effectively from scratch with fewer than 20,000 training samples. Online learning of the target’s temporal-spectral-spatial information alleviates model degradation, particularly when training data are insufficient, and makes the model more robust when tracking targets in complex scenes. In the 2021 IEEE WHISPERS Hyperspectral Object Tracking (HOT) Challenge, HA-Net obtained the best performance, with a distance precision (DP) score of 0.948 and an area under the curve (AUC) score of 0.688. HA-Net also runs at 14 frames per second, which is faster than existing hyperspectral object trackers. The source code is available at: https://github.com/zhenliuzhenqi/HOT.
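The abstract reports DP and AUC scores without defining them. The sketch below shows one common way such scores are computed in single-object tracking benchmarks; it is not taken from the HA-Net source code. The `(x, y, w, h)` box format, the 20-pixel DP threshold, the 21 evenly spaced overlap thresholds, and all function names are assumptions made for illustration.

```python
import math


def center_error(box_a, box_b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    ax, ay = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx, by = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    return math.hypot(ax - bx, ay - by)


def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[0] + box_a[2], box_b[0] + box_b[2])
    y2 = min(box_a[1] + box_a[3], box_b[1] + box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def distance_precision(preds, gts, threshold=20.0):
    """Fraction of frames whose center error is within `threshold` pixels.

    The 20-pixel default is an assumed convention, not a value from the paper.
    """
    errors = [center_error(p, g) for p, g in zip(preds, gts)]
    return sum(e <= threshold for e in errors) / len(errors)


def success_auc(preds, gts, num_thresholds=21):
    """Mean success rate over evenly spaced IoU thresholds in [0, 1].

    A frame counts as a success at threshold t when its IoU exceeds t.
    """
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)


if __name__ == "__main__":
    # Toy sequence: the first prediction is exact, the second misses entirely.
    gts = [(0, 0, 10, 10), (0, 0, 10, 10)]
    preds = [(0, 0, 10, 10), (100, 100, 10, 10)]
    print("DP:", distance_precision(preds, gts))
    print("AUC:", success_auc(preds, gts))
```

Under these assumptions, DP rewards predictions whose center stays close to the ground truth regardless of box size, while AUC also penalizes poor scale estimates, which is why trackers are usually reported with both.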