Enhancing Visual Place Recognition Using Discrete Cosine Transform and Difference-Based Descriptors

Qieshi Zhang, Zhenyu Xu, Zhiyong Yang, Ziliang Ren, Shuai Yuan, Jun Cheng
DOI: https://doi.org/10.1109/tcsii.2024.3358982
2024-01-01
Abstract: Visual Place Recognition (VPR) is a popular technique in agent localization that identifies previously visited places from their visual appearance. However, visual appearance is easily distorted by environmental factors such as weather and season, which degrades the performance of recent methods. To alleviate the influence of these factors, a Discrete Cosine Transform (DCT)-Mask Net is introduced, which exploits mid-frequency information to enhance the appearance invariance of representations. To improve the representations further, a Difference Net is employed to capture the change between frames and learn difference-based sequential descriptors. In addition, to improve the discriminability of features and prevent the model from overfitting, an improved loss function, namely Decorrelation and Regulation-triplet (DR-triplet), is proposed. Experimental results show that our method is more robust to appearance changes than state-of-the-art methods: it achieves superior recognition performance across different seasons on the Nordland dataset and shows the desired generalization capability under various weather conditions on the Virtual KITTI 2 dataset.
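The two ideas named in the abstract, keeping mid-frequency DCT content and describing a sequence by the change between frames, can be illustrated with a minimal NumPy sketch. This is not the paper's DCT-Mask Net or Difference Net: the abstract gives no architectural details, so the sketch assumes a fixed, hand-chosen band-pass mask and a plain first-order difference. The function names and the `low`/`high` band limits are hypothetical.

```python
import numpy as np


def dct_matrix(n):
    """Return the orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)    # rescale the DC row so that c @ c.T == I
    return c


def mid_frequency_filter(image, low=0.1, high=0.5):
    """Keep only DCT coefficients in an assumed mid-frequency band.

    `low` and `high` are hypothetical limits on the normalized
    frequency u/H + v/W; they are not values from the paper.
    """
    h, w = image.shape
    ch, cw = dct_matrix(h), dct_matrix(w)
    coeffs = ch @ image @ cw.T                 # 2-D DCT
    u = np.arange(h)[:, None] / h
    v = np.arange(w)[None, :] / w
    mask = (u + v >= low) & (u + v <= high)    # fixed band-pass mask
    return ch.T @ (coeffs * mask) @ cw         # inverse 2-D DCT


def difference_descriptors(frames):
    """First-order difference between consecutive per-frame descriptors."""
    return np.diff(frames, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32))
    filtered = mid_frequency_filter(image)
    print(filtered.shape)                      # same shape as the input
    sequence = rng.random((5, 128))            # 5 frames, 128-D descriptors
    print(difference_descriptors(sequence).shape)
```

A constant image has only a DC coefficient, so the band-pass output is zero everywhere. A global brightness offset is removed the same way, which is a simple instance of the appearance-invariance argument.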
engineering, electrical & electronic