Abstract: Visually impaired people (VIP) have difficulty localizing themselves accurately in daily life, so an efficient algorithm that addresses their localization needs is crucial. Visual Place Recognition (VPR) uses image retrieval to determine the location of a query image by matching it against a database of images, which makes it a promising way to help the VIP localize themselves. However, the accuracy of VPR is directly affected by changes in scene appearance, such as illumination, season and viewpoint. Extracting image descriptors that remain robust under such changes is therefore one of the most critical tasks in current VPR research. In this paper, we propose a VPR approach to assist the localization and navigation of visually impaired pedestrians. The core of our proposal is a combination of descriptors at three levels (the whole image, local regions and key-points), chosen to enhance the robustness of VPR. Matching a query image against the database images takes three steps. First, we extract Convolutional Neural Network (CNN) features of the whole images from a pre-trained GoogLeNet and compute the Euclidean distances between the query image and the database images to determine the top 10 matches. Second, local salient regions are detected in the top 10 matched images, with Non-Maximum Suppression (NMS) used to control the number of bounding boxes. Third, we detect SIFT key-points within the local salient regions, extract GeoDesc descriptors at those key-points, and use them to select the top-1 match among the top 10 candidates. To verify our approach, a comprehensive set of experiments has been conducted on datasets with challenging environmental changes, such as the GardensPointWalking dataset.
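
The three matching steps above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `top_k_matches`, `non_max_suppression` and `rerank` are hypothetical, the IoU threshold of 0.5 is an assumed value, and the CNN features, salient-region detector and key-point matcher are replaced by plain lists and a caller-supplied scoring function.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def top_k_matches(query: Vector, database: List[Vector], k: int = 10) -> List[int]:
    """Step 1: indices of the k database descriptors closest to the query
    in Euclidean distance (closest first)."""
    distances = sorted((math.dist(query, d), i) for i, d in enumerate(database))
    return [i for _, i in distances[:k]]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: List[Box], scores: List[float],
                        iou_threshold: float = 0.5) -> List[int]:
    """Step 2: greedy NMS. Keep the highest-scoring box, drop boxes that
    overlap a kept box by more than iou_threshold (assumed value)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


def rerank(candidates: List[int], local_score: Callable[[int], float]) -> int:
    """Step 3: pick the candidate with the best local score, for example
    the number of key-point matches inside the salient regions."""
    return max(candidates, key=local_score)


# Toy data standing in for whole-image CNN descriptors.
database = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.1, 0.0]]
candidates = top_k_matches([0.0, 0.0], database, k=2)

# Toy salient-region proposals for one image.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
kept = non_max_suppression(boxes, [0.9, 0.8, 0.7])

# Hypothetical key-point match counts per candidate image.
match_counts = {0: 5, 3: 12}
best = rerank(candidates, lambda i: match_counts[i])
```

The coarse retrieval keeps the expensive local matching restricted to a short candidate list, which is the usual motivation for this kind of coarse-to-fine design.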

DMPCANet: A Low Dimensional Aggregation Network for Visual Place Recognition

Visual Place Recognition Based on Multilevel Descriptors for the Visually Impaired People

Attention-based Pyramid Aggregation Network for Visual Place Recognition

Explicit Feature Disentanglement for Visual Place Recognition Across Appearance Changes

Gicnet: global information capture network for visual place recognition

Ghost-dil-NetVLAD: A Lightweight Neural Network for Visual Place Recognition

Hybrid CNN-Transformer Features for Visual Place Recognition

MultiRes-NetVLAD: Augmenting Place Recognition Training with Low-Resolution Imagery

Efficient 3D Point Cloud Feature Learning for Large-Scale Place Recognition

Register assisted aggregation for Visual Place Recognition

Enhancing Visual Place Recognition Using Discrete Cosine Transform and Difference-Based Descriptors

AANet: Aggregation and Alignment Network with Semi-hard Positive Sample Mining for Hierarchical Place Recognition

CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition

Convolutional MLP orthogonal fusion of multiscale features for visual place recognition

VDNA-PR: Using General Dataset Representations for Robust Sequential Visual Place Recognition

Spatial Pyramid-Enhanced NetVLAD With Weighted Triplet Loss for Place Recognition

VLAD-BuFF: Burst-aware Fast Feature Aggregation for Visual Place Recognition

SVS-VPR: A Semantic Visual and Spatial Information-Based Hierarchical Visual Place Recognition for Autonomous Navigation in Challenging Environmental Conditions

PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition

STA-VPR: Spatio-temporal Alignment for Visual Place Recognition