Abstract: Visually impaired (VI) people around the world have difficulty socializing and traveling because of the limitations of traditional assistive tools. In recent years, practical assistance systems for scene text detection and recognition have allowed VI people to obtain text information from their surroundings. However, real-world scene text features complex backgrounds, low resolution, variable fonts, and irregular arrangement, all of which make robust scene text detection and recognition difficult. In this paper, we propose a scene text recognition system to help VI people. First, we propose a high-performance neural network to detect and track objects, which is applied to specific scenes to obtain regions of interest (ROI). To achieve real-time detection, we build a lightweight deep neural network from depth-wise separable convolutions, which allows the system to be integrated into mobile devices with limited computational resources. Second, we train the neural network on textural features to improve the precision of text detection. Our algorithm uses spatial transformer networks to suppress the effects of spatial transformations, including translation, scaling, rotation, and other geometric transformations. An open-source optical character recognition (OCR) engine is trained on scene text to improve the accuracy of text recognition. The interactive system ultimately conveys the number and distance of inbound buses to VI users. Finally, a comprehensive set of experiments on several benchmark datasets demonstrates that our algorithm achieves a favorable trade-off between precision and resource usage.
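To illustrate why depth-wise separable convolutions suit devices with limited computational resources, the minimal sketch below compares the number of weights in a standard convolution with that of a depth-wise convolution followed by a 1 x 1 point-wise convolution. The function names and the layer sizes (3 x 3 kernel, 128 input channels, 256 output channels) are hypothetical and are not taken from the proposed network; bias terms are ignored.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weights in a depth-wise k x k convolution plus a 1 x 1 point-wise convolution."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
standard = standard_conv_params(3, 128, 256)
separable = depthwise_separable_params(3, 128, 256)
reduction = standard / separable

print(f"standard: {standard}, separable: {separable}, reduction: {reduction:.2f}x")
```

For this hypothetical layer the separable form needs roughly one ninth of the weights. In general the ratio is 1/C_out + 1/k^2 for kernel size k and C_out output channels, so the saving grows with the kernel size and the channel count.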

Perception Framework Through Real-Time Semantic Segmentation and Scene Recognition on a Wearable System for the Visually Impaired

Robustifying Semantic Cognition of Traversability Across Wearable RGB-depth Cameras

Long-Range Traversability Awareness and Low-Lying Obstacle Negotiation with RealSense for the Visually Impaired

Unifying Terrain Awareness Through Real-Time Semantic Segmentation

An Environmental Perception and Navigational Assistance System for Visually Impaired Persons Based on Semantic Stixels and Sound Interaction

Semantic perception of curbs beyond traversability for real-world navigation assistance systems

Scene Text Detection and Recognition System for Visually Impaired People in Real World

Unifying Visual Localization and Scene Recognition for People with Visual Impairment

A Wearable Vision-To-Audio Sensory Substitution Device for Blind Assistance and the Correlated Neural Substrates

Unconstrained Face Detection and Recognition Based on RGB-D Camera for the Visually Impaired

A Wearable Navigation Device for Visually Impaired People Based on the Real-Time Semantic Visual SLAM System

Unifying Terrain Awareness for the Visually Impaired through Real-Time Semantic Segmentation

Semantic scene understanding on mobile device with illumination invariance for the visually impaired

Sensing and Navigation of Wearable Assistance Cognitive Systems for the Visually Impaired

Panoptic Lintention Network: Towards Efficient Navigational Perception for the Visually Impaired

Wearable Vision Assistance System Based on Binocular Sensors for Visually Impaired Users

A Wearable Visually Impaired Assistive System Based on Semantic Vision SLAM for Grasping Operation

HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor

Wearable Travel Aid for Environment Perception and Navigation of Visually Impaired People

Can We Unify Perception and Localization in Assisted Navigation? An Indoor Semantic Visual Positioning System for Visually Impaired People

Intersection Perception Through Real-Time Semantic Segmentation to Assist Navigation of Visually Impaired Pedestrians