Abstract:Here we address a challenging problem: recognizing multiple text sequences from an image by pure end-to-end learning. It is twofold: 1) Multiple text sequences recognition. Each image may contain multiple text sequences of different content, location and orientation, and we try to recognize all the text sequences contained in the image. 2) Pure end-to-end (PEE) <a class="link-external link-http" href="http://learning.We" rel="external noopener nofollow">this http URL</a> solve the problem in a pure end-to-end learning way where each training image is labeled by only text transcripts of all contained sequences, without any geometric annotations. Most existing works recognize multiple text sequences from an image in a non-end-to-end (NEE) or quasi-end-to-end (QEE) way, in which each image is trained with both text transcripts and text <a class="link-external link-http" href="http://locations.Only" rel="external noopener nofollow">this http URL</a> recently, a PEE method was proposed to recognize text sequences from an image where the text sequence was split to several lines in the image. However, it cannot be directly applied to recognizing multiple text sequences from an image. So in this paper, we propose a pure end-to-end learning method to recognize multiple text sequences from an image. Our method directly learns multiple sequences of probability distribution conditioned on each input image, and outputs multiple text transcripts with a well-designed decoding <a class="link-external link-http" href="http://strategy.To" rel="external noopener nofollow">this http URL</a> evaluate the proposed method, we constructed several datasets mainly based on an existing public dataset andtwo real application scenarios. Experimental results show that the proposed method can effectively recognize multiple text sequences from images, and outperforms CTC-based and attention-based baseline methods.

Recognizing Multiple Text Sequences from an Image by Pure End-to-End Learning

Towards Pure End-to-End Learning for Recognizing Multiple Text Sequences from an Image

Cross-Lingual Text Image Recognition Via Multi-Task Sequence to Sequence Learning.

Learn, Imagine and Create: Text-to-Image Generation from Prior Knowledge.

End-To-End Chinese Text Recognition

An end-to-end model for multi-view scene text recognition

Chinese Image Text Recognition with BLSTM-CTC: A Segmentation-Free Method.

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

SEE: Towards Semi-Supervised End-to-End Scene Text Recognition

Towards End-to-End Text Spotting in Natural Scenes

Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes

Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks

C V ] 9 M ay 2 01 8 Edit Probability for Scene Text Recognition

Scene Text Detection via Holistic, Multi-Channel Prediction

An approach for handwritten Chinese text recognition unifying character segmentation and recognition

TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network.

E2E-MLT - An Unconstrained End-to-End Method for Multi-language Scene Text

Scene Text Recognition from Two-Dimensional Perspective

MTSTR: Multi-task learning for low-resolution scene text recognition via dual attention mechanism and its application in logistics industry

A Multiplexed Network for End-to-End, Multilingual OCR