Abstract: Visually impaired people (VIP) have difficulty localizing themselves accurately in daily life, so developing an efficient algorithm to address their localization needs is crucial. Visual Place Recognition (VPR) uses image retrieval algorithms to determine the location of a query image by matching it against a database, which makes it a promising way to help the VIP localize. However, the accuracy of VPR is directly affected by changes in scene appearance, such as illumination, season, and viewpoint. Therefore, extracting image descriptors that remain robust under such changes is one of the most critical tasks in current VPR research. In this paper, we propose a VPR approach to assist the localization and navigation of visually impaired pedestrians. The core of our proposal is a combination of multi-level descriptors, computed from the whole image, local regions, and key-points, aimed at enhancing the robustness of VPR. The matching procedure between query images and database images comprises three steps. First, we obtain Convolutional Neural Network (CNN) features of the whole images from a pre-trained GoogLeNet and compute the Euclidean distances between the query images and the database images to determine the top 10 matches. Second, local salient regions are detected in the top 10 matched images, with Non-Maximum Suppression (NMS) used to control the number of bounding boxes. Third, we detect SIFT key-points within the local salient regions, extract GeoDesc descriptors of these key-points, and use them to determine the top 1 match among the top 10. To verify our approach, a comprehensive set of experiments has been conducted on datasets with challenging environmental changes, such as the GardensPointWalking dataset.
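
The three-step matching procedure described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the descriptors are assumed to be precomputed lists of floats (standing in for GoogLeNet features), the salient boxes and their scores are assumed to come from some detector, and `local_match_count` is a hypothetical callback standing in for SIFT/GeoDesc matching. The function names and the IoU threshold of 0.5 are likewise assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_candidates(query_desc, db_descs, k=10):
    """Step 1: indices of the k database descriptors closest to the query."""
    order = sorted(range(len(db_descs)),
                   key=lambda i: euclidean(query_desc, db_descs[i]))
    return order[:k]


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Step 2: greedy NMS. Visit boxes by descending score and keep a box
    only if it does not overlap an already kept box above the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def rerank(candidates, local_match_count):
    """Step 3: choose the candidate with the most local descriptor matches.
    `local_match_count` is a hypothetical stand-in for key-point matching."""
    return max(candidates, key=local_match_count)


if __name__ == "__main__":
    # Toy whole-image descriptors (assumed data, not real CNN features).
    db = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.1, 0.0]]
    shortlist = top_k_candidates([0.0, 0.0], db, k=2)

    # Toy salient-region boxes with detector scores (assumed data).
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    kept = nms(boxes, [0.9, 0.8, 0.7])

    # Toy local match counts per database image (assumed data).
    best = rerank(shortlist, lambda i: {0: 5, 3: 12}[i])
    print(shortlist, kept, best)
```

The coarse-to-fine design keeps the expensive local matching confined to a short list: the global distance ranks the whole database cheaply, and only the shortlisted images are re-ranked with local evidence.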

Business-Aware Visual Concept Discovery from Social Media for Multimodal Business Venue Recognition

Visual Place Recognition Based on Multilevel Descriptors for the Visually Impaired People

Category-Based Deep CCA for Fine-Grained Venue Discovery from Multimodal Data

Semantic-Based Location Recommendation With Multimodal Venue Semantics

Enabling the interpretability of pretrained venue representations using semantic categories

Large Scale Business Discovery from Street Level Imagery

Learning Bipartite Graph Matching for Robust Visual Localization

Venue Prediction for Social Images by Exploiting Rich Temporal Patterns in LBSNs

Abstract Venue Concept Detection From Location-Based Social Networks

Cross-modal Place Recognition in Image Databases using Event-based Sensors

Hierarchy-Dependent Cross-Platform Multi-View Feature Learning for Venue Category Prediction

Enhancing Micro-Video Venue Recognition via Multi-Modal and Multi-Granularity Object Relations

Multi-context Embedding Based Personalized Place Semantics Recognition

Monocular Visual Place Recognition in LiDAR Maps via Cross-Modal State Space Model and Multi-View Matching

FM-Loc: Using Foundation Models for Improved Vision-based Localization

Scene Retrieval for Contextual Visual Mapping

Discovering Place-Informative Scenes and Objects Using Social Media Photos

ModaLink: Unifying Modalities for Efficient Image-to-PointCloud Place Recognition

BEVPlace: Learning LiDAR-based Place Recognition using Bird's Eye View Images

Intelligent Reference Curation for Visual Place Recognition via Bayesian Selective Fusion

Business Category Classification via Indistinctive Satellite Image Analysis Using Deep Learning