LSDNet: A Lightweight Self-Attentional Distillation Network for Visual Place Recognition

Guohao Peng, Yifeng Huang, Heshan Li, Zhenyu Wu, Danwei Wang
DOI: https://doi.org/10.1109/IROS47612.2022.9982272
2022-01-01
Abstract: Visual Place Recognition (VPR) has become an indispensable capacity for mobile robots to operate in large-scale environments. Existing methods in this field mostly focus on exploring high-performance encoding strategies, while few attempts are devoted to lightweight models that balance performance and computational cost. In this work, we propose a Lightweight Self-attentional Distillation Network (LSDNet) aiming to obtain advantages of both performance and efficiency. (1) From a performance perspective, an attentional encoding strategy is proposed to integrate crucial information in the scene. It extends the NetVLAD architecture with a self-attention module to facilitate non-local information interaction between local features. Through further visual word vector rescaling, the final image representation can benefit from both non-local spatial integration and cluster-wise weighting. (2) From an efficiency perspective, LSDNet is built upon a lightweight backbone. To maintain comparable performance to large backbone models, a dual distillation strategy is introduced. It prompts LSDNet to learn both encoding patterns in the hidden space and feature distributions in the encoding space from the teacher model. Through distillation-augmented training, LSDNet is able to rival the teacher model and outperform state-of-the-art (SOTA) global representations with the same lightweight backbone.
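The dual distillation idea in the abstract (matching the teacher both in the hidden space and in the encoding space) can be illustrated with a minimal sketch. The abstract does not give the actual loss functions, so everything below is an assumption made for illustration: the function names, the weights alpha and beta, the use of mean squared error for the hidden-space term, and the use of squared Euclidean distance between L2-normalized descriptors for the encoding-space term. It is not the authors' implementation.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def mean_squared_error(a, b):
    """Mean squared difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def dual_distillation_loss(student_hidden, teacher_hidden,
                           student_desc, teacher_desc,
                           alpha=1.0, beta=1.0):
    """Hypothetical two-term distillation loss.

    hidden term   : MSE between flattened student and teacher hidden features
                    (assumed stand-in for "encoding patterns in the hidden space").
    encoding term : squared Euclidean distance between L2-normalized global
                    descriptors (assumed stand-in for "feature distributions
                    in the encoding space").
    alpha, beta   : illustrative weights, not taken from the paper.
    """
    hidden_term = mean_squared_error(student_hidden, teacher_hidden)
    s = l2_normalize(student_desc)
    t = l2_normalize(teacher_desc)
    encoding_term = sum((x - y) ** 2 for x, y in zip(s, t))
    return alpha * hidden_term + beta * encoding_term


if __name__ == "__main__":
    # Toy values only; real inputs would be feature maps and global descriptors.
    loss = dual_distillation_loss(
        student_hidden=[0.0, 0.0], teacher_hidden=[2.0, 2.0],
        student_desc=[1.0, 0.0], teacher_desc=[0.0, 1.0],
    )
    print(loss)
```

In a real training setup this term would presumably be added to the usual place recognition objective, with the teacher's weights frozen; that combination is likewise an assumption here rather than something stated in the abstract.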
What problem does this paper attempt to address?