Abstract: Scene text detection refers to locating text regions in a scene image and marking them with text boxes. With the rapid development of the mobile Internet and the growing popularity of mobile devices such as smartphones, scene text detection has attracted considerable research attention and is widely applied. In recent years, the rise of deep learning, represented by convolutional neural networks, has driven new progress in the field. However, scene text detection remains a challenging task for two reasons. First, images of natural scenes often have complex backgrounds, which can easily interfere with detection. Second, text in natural scenes is highly diverse: horizontal, skewed, straight, and curved text may all appear in the same scene. In addition, when convolutional neural networks extract features, convolutional layers with a limited receptive field cannot model global semantic information well. This paper therefore proposes a scene text detection algorithm based on dual-branch feature extraction. A residual correction branch (RCB) enlarges the receptive field so that the network captures contextual information over a wider area. To use the extracted features more efficiently, a two-branch attentional feature fusion (TB-AFF) module is built on the feature pyramid network (FPN); it combines global and local attention to pinpoint text regions, increase the sensitivity of the network to text, and accurately locate text in natural scenes. Several sets of comparative experiments against current mainstream text detection methods were conducted, and the proposed method achieved better results in all of them, which verifies its effectiveness.
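The idea of fusing two feature maps with combined global and local attention can be illustrated with a minimal NumPy sketch. The abstract does not give the internals of TB-AFF, so everything below is an assumption made for illustration: the function names (`two_branch_fusion`, `channel_mlp`, `init_params`), the bottleneck channel MLP, the reduction ratio, and the random weights are hypothetical and are not the authors' implementation. The sketch only shows one plausible way a globally pooled branch and a per-pixel branch could jointly produce a weight that blends two FPN-style feature maps.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_mlp(x, w1, w2):
    """Bottleneck MLP applied along the channel axis of a (C, H, W) array."""
    hidden = np.maximum(np.einsum("rc,chw->rhw", w1, x), 0.0)  # ReLU
    return np.einsum("cr,rhw->chw", w2, hidden)


def init_params(channels, reduction=4, seed=0):
    """Random weights for the two branches (hypothetical, untrained)."""
    rng = np.random.default_rng(seed)
    reduced = max(channels // reduction, 1)
    return {
        "local_w1": rng.normal(0.0, 0.1, (reduced, channels)),
        "local_w2": rng.normal(0.0, 0.1, (channels, reduced)),
        "global_w1": rng.normal(0.0, 0.1, (reduced, channels)),
        "global_w2": rng.normal(0.0, 0.1, (channels, reduced)),
    }


def two_branch_fusion(low, high, params):
    """Blend two (C, H, W) feature maps with a global + local attention weight.

    Assumed design, not taken from the paper:
    - local branch: channel MLP applied at every pixel
    - global branch: global average pooling, then a channel MLP
    """
    summed = low + high
    local = channel_mlp(summed, params["local_w1"], params["local_w2"])
    pooled = summed.mean(axis=(1, 2), keepdims=True)  # shape (C, 1, 1)
    global_ctx = channel_mlp(pooled, params["global_w1"], params["global_w2"])
    # The (C, 1, 1) global term broadcasts over the (C, H, W) local term.
    weight = sigmoid(local + global_ctx)
    fused = weight * low + (1.0 - weight) * high
    return fused, weight


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    low = rng.normal(size=(8, 5, 6))   # e.g. an upsampled top-down feature map
    high = rng.normal(size=(8, 5, 6))  # e.g. a lateral feature map
    fused, weight = two_branch_fusion(low, high, init_params(8))
    print(fused.shape, float(weight.min()), float(weight.max()))
```

Because the weight lies strictly between 0 and 1, each fused value is a convex combination of the two inputs, so the module can emphasize one feature map over the other per channel and per pixel without changing the feature scale.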

A two-stage method for text line detection in historical documents

A method for text line detection in natural images

Robust text line detection in historical documents: learning and evaluation methods

Text Line Segmentation in Historical Document Images Using an Adaptive U-Net Architecture

General Detection-based Text Line Recognition

Page Layout Analysis System for Unconstrained Historic Documents

Artificial Text Detection with Multiple Training Strategies

Texts As Lines: Text Detection with Weak Supervision

Combining Morphological and Histogram based Text Line Segmentation in the OCR Context

Text Line Segmentation from Struck-out Handwritten Document Images

What's Wrong with the Bottom-up Methods in Arbitrary-shape Scene Text Detection

Arbitrary-shaped scene text detection by predicting distance map

Scene Text Detection with Fully Convolutional Neural Networks

STN-OCR: A single Neural Network for Text Detection and Text Recognition

Scene Text Detection Based on Two-Branch Feature Extraction

Scene Text Detection Based On Robust Stroke Width Transform And Deep Belief Network

A Deep Learning Approach for Robust, Multi-oriented, and Curved Text Detection

Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition

ReLaText: Exploiting Visual Relationships for Arbitrary-Shaped Scene Text Detection with Graph Convolutional Networks

LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network