Abstract: Visually impaired (VI) people around the world face difficulties in socializing and traveling because of the limitations of traditional assistive tools. In recent years, practical assistance systems for scene text detection and recognition have allowed VI people to obtain text information from their surroundings. However, real-world scene text features complex backgrounds, low resolution, variable fonts, and irregular arrangement, all of which make robust scene text detection and recognition difficult. In this paper, we propose a scene text recognition system to help VI people. First, we propose a high-performance neural network to detect and track objects, which is applied to specific scenes to obtain regions of interest (ROIs). To achieve real-time detection, we build a lightweight deep neural network from depth-wise separable convolutions, which allows the system to be integrated into mobile devices with limited computational resources. Second, we train the neural network on textural features to improve the precision of text detection. Our algorithm uses spatial transformer networks to suppress the effects of spatial transformations (including translation, scaling, rotation, and other geometric transformations). An open-source optical character recognition (OCR) engine is trained on scene texts individually to improve the accuracy of text recognition. The interactive system ultimately conveys the number and distance of inbound buses to visually impaired people. Finally, a comprehensive set of experiments on several benchmark datasets demonstrates that our algorithm achieves a strong trade-off between precision and resource usage.
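
To illustrate why depth-wise separable convolutions suit devices with limited computational resources, the following minimal sketch compares the weight count of a standard convolution with that of a depth-wise convolution followed by a 1 x 1 point-wise convolution. The function names and layer sizes are hypothetical and chosen only for illustration; they are not taken from the paper's network.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depth-wise k x k convolution plus a 1 x 1 point-wise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes (assumption, not from the paper).
    k, c_in, c_out = 3, 128, 256
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.4f}")
```

The ratio of the two counts equals 1/c_out + 1/k^2, so for 3 x 3 kernels and a large number of output channels the separable form needs roughly one ninth of the weights.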

A Research on Video Text Tracking and Recognition

A new video text detection method

A Novel Approach to Text Detection and Extraction from Videos by Discriminative Features and Density

A New Technique for Multi-Oriented Scene Text Line Detection and Tracking in Video

A method for text line detection in natural images

A New Method for Text Location in News Video Based on Ant Colony Algorithm

Video Text Tracking With a Spatio-Temporal Complementary Model

Scene Text Detection and Tracking in Video with Background Cues

Scene Text Detection and Recognition System for Visually Impaired People in Real World

Video text rediscovery: Predicting and tracking text across complex scenes

You Only Recognize Once: Towards Fast Video Text Spotting

Video text detection and segmentation for optical character recognition

Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform

Tracking Based Multi-Orientation Scene Text Detection: A Unified Framework With Dynamic Programming

Detecting both superimposed and scene text with multiple languages and multiple alignments in video

End-to-end video text detection with online tracking

A Video Text Detecting Method Based on Edge Detection and Line Features

Text Detection Using Delaunay Triangulation in Video Sequence

Effective Video Text Detection Using Line Features

Video Text Detection with Fully Convolutional Network and Tracking

Text Component Reconstruction for Tracking in Video