Abstract: Optical character recognition (OCR) is the process of extracting text and layout information from image files through analysis and recognition. It also involves identifying the geometric location and orientation of text and its symmetrical behavior. OCR usually consists of two steps: text detection and text recognition. Scene text recognition is a subfield of OCR that focuses on text in natural scenes, such as streets, billboards, and license plates. Compared with traditional document images, locating and reading text in natural scenes is a challenging task. Image-based sequence recognition is a longstanding research subject in computer vision. Great progress has been made in this field; however, most models still struggle to recognize text in images of complex scenes with high accuracy. To address this issue, this paper proposes a new text recognition approach based on the convolutional recurrent neural network (CRNN). It combines real-time scene text detection with differentiable binarization (DBNet) for text detection and segmentation, a text direction classifier, and the Retinex algorithm for image enhancement. To evaluate the proposed method, we carried out experimental analysis and simulations on complex scene image data drawn from the existing literature and on several real datasets designed for a variety of nonstationary environments. Experimental results demonstrate that the proposed model performs better than the baseline methods on three benchmark datasets and achieves on-par performance with other approaches on existing datasets. The model addresses the inability of CRNN to recognize text in complex and multi-oriented scenes, and it outperforms the original CRNN model in accuracy across a wider variety of application scenarios.
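As an illustration of the image enhancement stage, the following is a minimal sketch of single-scale Retinex on a grayscale image, written with NumPy only. The abstract does not state which Retinex variant or parameters are used, so the single-scale form, the `sigma` value, the edge padding, and the helper names `gaussian_kernel`, `gaussian_blur`, and `single_scale_retinex` are all assumptions made for this example and not the authors' implementation.

```python
import numpy as np


def gaussian_kernel(sigma: float) -> np.ndarray:
    """Return a normalized 1-D Gaussian kernel covering about 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(image: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur of a 2-D array with edge padding (assumed choice)."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image, radius, mode="edge")
    # Convolve along rows, then along columns; "valid" restores the input size.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def single_scale_retinex(image: np.ndarray, sigma: float = 15.0) -> np.ndarray:
    """Estimate reflectance as log(image) - log(blurred image), scaled to 0..255."""
    img = image.astype(np.float64) + 1.0  # offset avoids log(0)
    illumination = gaussian_blur(img, sigma)  # smooth illumination estimate
    reflectance = np.log(img) - np.log(illumination)
    lo, hi = reflectance.min(), reflectance.max()
    if hi - lo < 1e-12:  # flat image: nothing to stretch
        return np.zeros(image.shape, dtype=np.uint8)
    return ((reflectance - lo) / (hi - lo) * 255.0).astype(np.uint8)


if __name__ == "__main__":
    # Synthetic dark horizontal gradient standing in for a poorly lit scene image.
    dark = np.tile(np.linspace(5.0, 60.0, 32), (16, 1))
    enhanced = single_scale_retinex(dark, sigma=3.0)
    print(enhanced.shape, enhanced.dtype)
```

In a pipeline like the one described, an enhanced image of this kind would be passed to the detector before recognition; the order and interfaces of those stages are not specified in the abstract.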

A Novel Short-Memory Sequence-Based Model for Variable-Length Reading Recognition of Multi-Type Digital Instruments in Industrial Scenarios

Scene Text Detection and Recognition System for Visually Impaired People in Real World

Human Identification by Means of Optoelectronic Reservoir Computing

A Mask RCNN based Automatic Reading Method for Pointer Meter

Variabletypography: artificial intelligence augmented reading experience

Investigation on Intelligent Recognition System of Instrument Based on Multi-step Convolution Neural Network

Coordinate Embedding Transformer Model for Optical Music Recognition on Monophonic Scores

A Convolutional Recurrent Neural-Network-Based Machine Learning for Scene Text Recognition Application

A Deep Learning Technology based OCR Framework for Recognition Handwritten Expression and Text

Read Pointer Meters in complex environments based on a Human-like Alignment and Recognition Algorithm

A Multitask Learning Approach for Chinese National Instruments Recognition and Timbre Space Regression

On the Use of Attention Mechanism in a Seq2Seq Based Approach for Off-Line Handwritten Digit String Recognition

On the Hidden Mystery of OCR in Large Multimodal Models

CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy

OCR is All you need: Importing Multi-Modality into Image-based Defect Detection System

MTSTR: Multi-task learning for low-resolution scene text recognition via dual attention mechanism and its application in logistics industry

Design of recognition algorithm for multiclass digital display instrument based on convolution neural network

The Industrial Application of Artificial Intelligence-Based Optical Character Recognition in Modern Manufacturing Innovations

Efficient OCR for Building a Diverse Digital History

High-Performance Ocr On Packing Boxes In Industry Based On Deep Learning

A Feasible Framework for Arbitrary-Shaped Scene Text Recognition