Preterm birth in multiple pregnancy: a glimmer of hope?

N. Chescheir

DOI: https://doi.org/10.1016/S0140-6736(13)61540-8

IF: 202.731

2013-10-19

The Lancet

Renal Function in Compensated Hepatic Cirrhosis: Effects of an Amino Acid Infusion and Relationship with Nitric Acid

A. Rodriquez,A. Martin,J. A. Oterino,I. Blanco,M. Jiménez,A. Perez,J. M. Novoa

DOI: https://doi.org/10.1159/000016942

2000-03-31

Digestive Diseases

Abstract:Aims: In order to assess the possible participation of nitric oxide (NO) in renal function during compensated hepatic cirrhosis, we studied renal function, the plasma and urinary levels of cGMP and the concentration of nitrates and nitrites, as markers of NO synthesis in blood and urine, in 10 patients with Child A hepatic cirrhosis as compared with 10 control subjects, both under basal conditions and during stimulation (amino acid-induced glomerular hyperfiltration). Methods: To study renal function, the glomerular filtration rate (GFR), effective renal plasma flow (ERPF), renal functional reserve (RFR), renal venous resistance (RVR) and the filtration fraction (FF) were measured. Renin and aldosterone levels were determined to assess the possible involvement of these compounds in the renin-angiotensin-aldosterone axis. Results: GFR and ERPF were significantly lower in the patients with cirrhosis than in the controls (mean GFR: 82±12.3 vs. 105±15 ml/min, p = 0.01; ERPF 452±86 vs. 543±56 ml/min, p = 0.002). The RFR value was similar in both groups. In the basal situation cGMP levels were higher in plasma and urine in patients with cirrhosis than in the controls (plasma cGMP in cirrhosis 8.4±2.4 vs. 4.2±3.5 pmol/ml; urine cGMP in cirrhosis 1.2±2.1 vs. 0.68±0.1 pmol/ml). The NO levels were also higher in plasma and urine in patients with cirrhosis vs. controls (plasma NO in cirrhosis 45.5±9.2 vs. 30.3±1.2 μmol/l; urinary NO in cirrhosis 6.2±1.3 vs. 3.1±2.3 μmol/ml). In both groups the amino acid perfusion increased GFR, ERPF, cGMP and NO levels in plasma and urine. In the patients with cirrhosis the RVR decreased significantly during perfusion and no noteworthy changes in FF were observed. The GFR values observed during amino acid perfusion were similar in patients with cirrhosis and portal hypertension to those observed in the controls (27.2±12 vs. 25.3±16%). However, the changes induced in the ERPF were more marked in patients with cirrhosis (cirrhosis 35.3±15 vs. 22.2±13%, p = 0.02). Conclusions: The present findings point to certain alterations in renal function in patients with hepatic cirrhosis and portal hypertension without ascites, a clear difference being visible between the ERPF and GFR following amino acid-induced stimulation. The significant elevation in cGMP and NO levels in plasma and urine implies a maintained vasodilatory action that may at least partly compensate for the vasoconstrictor effects of angiotensin II.
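As a rough illustration of how the filtration fraction relates to the two flows reported in this abstract, the standard definition FF = GFR/ERPF can be applied to the basal group means. This is only a sketch: the ratio of group means is an assumption made here for illustration and is not the per-patient FF measured in the study, which the abstract does not report.

```latex
\mathrm{FF} = \frac{\mathrm{GFR}}{\mathrm{ERPF}}, \qquad
\mathrm{FF}_{\text{cirrhosis}} \approx \frac{82}{452} \approx 0.18, \qquad
\mathrm{FF}_{\text{controls}} \approx \frac{105}{543} \approx 0.19
```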
A Convolutional Recurrent Neural-Network-Based Machine Learning for Scene Text Recognition Application

Yiyi Liu,Yuxin Wang,Hongjian Shi

DOI: https://doi.org/10.3390/sym15040849

2023-04-03

Symmetry

Abstract:Optical character recognition (OCR) is the process of acquiring text and layout information through analysis and recognition of text data image files. It is also a process to identify the geometric location and orientation of the texts and their symmetrical behavior. It usually consists of two steps: text detection and text recognition. Scene text recognition is a subfield of OCR that focuses on processing text in natural scenes, such as streets, billboards, license plates, etc. Unlike traditional document category photographs, it is a challenging task to use computer technology to locate and read text information in natural scenes. Imaging sequence recognition is a longstanding subject of research in the field of computer vision. Great progress has been made in this field; however, most models struggled to recognize text in images of complex scenes with high accuracy. This paper proposes a new pattern of text recognition based on the convolutional recurrent neural network (CRNN) as a solution to address this issue. It combines real-time scene text detection with differentiable binarization (DBNet) for text detection and segmentation, text direction classifier, and the Retinex algorithm for image enhancement. To evaluate the effectiveness of the proposed method, we performed experimental analysis of the proposed algorithm, and carried out simulation on complex scene image data based on existing literature data and also on several real datasets designed for a variety of nonstationary environments. Experimental results demonstrated that our proposed model performed better than the baseline methods on three benchmark datasets and achieved on-par performance with other approaches on existing datasets. This model can solve the problem that CRNN cannot identify text in complex and multi-oriented text scenes. Furthermore, it outperforms the original CRNN model with higher accuracy across a wider variety of application scenarios.

multidisciplinary sciences
A Text-Context-Aware CNN Network for Multi-oriented and Multi-language Scene Text Detection.

Yao Xiao,Minglong Xue,Tong Lu,Yirui Wu,Shivakumara Palaiahnakote

DOI: https://doi.org/10.1109/ICDAR.2019.00116

2019-01-01

Abstract:The existing deep learning based state-of-the-art scene text detection methods either treat scene text as a type of general object, or segment text regions directly. The latter category achieves remarkable detection results on arbitrary-orientation and large aspect ratios of scene texts based on instance segmentation algorithms. However, due to the lack of context information that considers the unique characteristics of scene text, directly applying instance segmentation to the text detection task is prone to result in low accuracy, especially producing false positive detection results. To ease this problem, we propose a novel text-context-aware scene text detection CNN structure, which appropriately encodes channel and spatial attention information to construct a context-aware and discriminative feature map for multi-oriented and multi-language text detection tasks. With the high representation ability of the text-context-aware feature map, the proposed instance segmentation based method can not only robustly detect multi-oriented and multi-language text from natural scene images, but also produce better text detection results by greatly reducing false positives. Experiments on the ICDAR2015 and ICDAR2017-MLT datasets show that the proposed method achieves better precision, recall and F-measure than most existing methods.
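Several abstracts in this list, including the one above, report precision, recall and F-measure. The following is a minimal sketch of how these three quantities are conventionally computed from detection counts; the function name and the count-based interface are assumptions for illustration, and the benchmark-specific matching rules that decide what counts as a true positive are not modeled.

```python
from typing import Tuple


def precision_recall_f(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) from raw detection counts.

    tp: detections matched to a ground-truth text instance
    fp: detections with no matching ground truth (false positives)
    fn: ground-truth instances that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure
```

Reducing false positives, as the abstract describes, lowers `fp` and therefore raises precision and, through the harmonic mean, the F-measure.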
MASTER: Multi-Aspect Non-local Network for Scene Text Recognition

Ning Lu,Wenwen Yu,Xianbiao Qi,Yihao Chen,Ping Gong,Rong Xiao,Xiang Bai

DOI: https://doi.org/10.1016/j.patcog.2021.107980

2021-04-11

Abstract:Attention-based scene text recognizers have achieved great success; they leverage a more compact intermediate representation to learn 1D or 2D attention with an RNN-based encoder-decoder architecture. However, such methods suffer from the attention-drift problem because high similarity among encoded features leads to attention confusion under the RNN-based local attention mechanism. Moreover, RNN-based methods have low efficiency due to poor parallelization. To overcome these problems, we propose MASTER, a self-attention based scene text recognizer that (1) not only encodes the input-output attention but also learns self-attention, which encodes feature-feature and target-target relationships inside the encoder and decoder, (2) learns an intermediate representation that is more powerful and more robust to spatial distortion, and (3) offers high training efficiency because of high training parallelization and high-speed inference because of an efficient memory-cache mechanism. Extensive experiments on various benchmarks demonstrate the superior performance of our MASTER on both regular and irregular scene text. PyTorch code can be found at https://github.com/wenwenyu/MASTER-pytorch, and TensorFlow code can be found at https://github.com/jiangxiluning/MASTER-TF.

Computer Vision and Pattern Recognition
CORE-Text: Improving Scene Text Detection with Contrastive Relational Reasoning

Jingyang Lin,Yingwei Pan,Rongfeng Lai,Xuehang Yang,Hongyang Chao,Ting Yao

DOI: https://doi.org/10.48550/arXiv.2112.07513

2021-12-14

Computer Vision and Pattern Recognition

Abstract:Localizing text instances in natural scenes is regarded as a fundamental challenge in computer vision. Nevertheless, owing to the extremely varied aspect ratios and scales of text instances in real scenes, most conventional text detectors suffer from the sub-text problem, in which only fragments of a text instance (i.e., sub-texts) are localized. In this work, we quantitatively analyze the sub-text problem and present a simple yet effective design, the COntrastive RElation (CORE) module, to mitigate that issue. CORE first leverages a vanilla relation block to model the relations among all text proposals (sub-texts of multiple text instances) and further enhances relational reasoning via instance-level sub-text discrimination in a contrastive manner. This naturally learns instance-aware representations of text proposals and thus facilitates scene text detection. We integrate the CORE module into a two-stage text detector of Mask R-CNN and devise our text detector CORE-Text. Extensive experiments on four benchmarks demonstrate the superiority of CORE-Text. Code is available at https://github.com/jylins/CORE-Text.
Character Region Awareness Network for Scene Text Recognition

Mingyu Shang,Jie Gao,Jun Sun

DOI: https://doi.org/10.1109/icme46284.2020.9102785

2020-01-01

Abstract:Recognizing text in natural scenes is still a very challenging task, due to arbitrary shapes, varying fonts, complex backgrounds and so on. Recently, some recognizers have utilized a Spatial Transformer Network (STN) to rectify irregular text instances and achieved promising results. However, their robustness and accuracy are still limited, since rectification performance can be easily degraded by challenging samples. To tackle this issue, we propose a simple yet effective two-dimensional (2D) character attention module, which can enhance foreground text instances via character region awareness. By incorporating this into an existing rectification pipeline, we build a novel scene text recognizer named Character Region Awareness Network (CRAN). Extensive experiments demonstrate that our CRAN outperforms previous methods on nearly all benchmarks of both regular and irregular text, particularly on SVT (+2.0%), SVTP (+1.5%) and CUTE80 (+2.1%).
Multi-oriented Scene Text Detection via Corner Localization and Region Segmentation

Pengyuan Lyu,Cong Yao,Wenhao Wu,Shuicheng Yan,Xiang Bai

DOI: https://doi.org/10.1109/cvpr.2018.00788

2018-06-01

Abstract:Previous deep learning based state-of-the-art scene text detection methods can be roughly classified into two categories. The first category treats scene text as a type of general object and follows the general object detection paradigm to localize scene text by regressing the text box locations, but is troubled by the arbitrary orientation and large aspect ratios of scene text. The second one segments text regions directly, but mostly needs complex post-processing. In this paper, we present a method that combines the ideas of the two types of methods while avoiding their shortcomings. We propose to detect scene text by localizing corner points of text bounding boxes and segmenting text regions in relative positions. In the inference stage, candidate boxes are generated by sampling and grouping corner points, which are further scored by segmentation maps and suppressed by NMS. Compared with previous methods, our method can handle long oriented text naturally and doesn't need complex post-processing. The experiments on ICDAR2013, ICDAR2015, MSRA-TD500, MLT and COCO-Text demonstrate that the proposed algorithm achieves better or comparable results in both accuracy and efficiency. Based on VGG16, it achieves an F-measure of 84.3% on ICDAR2015 and 81.5% on MSRA-TD500.
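The last inference step mentioned above, suppressing scored candidate boxes with NMS, can be sketched as follows. This is a minimal greedy non-maximum suppression over axis-aligned boxes, given as an assumption-laden illustration: the paper works with oriented text boxes and segmentation-based scores, neither of which is modeled here, and the names `iou`, `nms` and the default threshold are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if
    it does not overlap an already kept box by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep  # indices into `boxes`, highest score first
```

For oriented text, the same loop applies if `iou` is replaced by a polygon intersection-over-union.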
Scene Text Recognition with Sliding Convolutional Character Models

Fei Yin,Yi-Chao Wu,Xu-Yao Zhang,Cheng-Lin Liu

DOI: https://doi.org/10.48550/arXiv.1709.01727

2017-09-06

Abstract:Scene text recognition has attracted great interest from the computer vision and pattern recognition community in recent years. State-of-the-art methods use convolutional neural networks (CNNs), recurrent neural networks with long short-term memory (RNN-LSTM) or a combination of them. In this paper, we investigate the intrinsic characteristics of text recognition and, inspired by human cognition mechanisms in reading texts, propose a scene text recognition method with character models on a convolutional feature map. The method simultaneously detects and recognizes characters by sliding over the text line image with character models, which are learned end-to-end on text line images labeled with text transcripts. The character classifier outputs on the sliding windows are normalized and decoded with a Connectionist Temporal Classification (CTC) based algorithm. Compared to previous methods, our method has a number of appealing properties: (1) it avoids the difficulty of character segmentation, which hinders the performance of segmentation-based recognition methods; (2) the model can be trained simply and efficiently because it avoids gradient vanishing/exploding in training RNN-LSTM based models; (3) it is based on character models trained free of a lexicon, and can recognize unknown words; (4) the recognition process is highly parallel and enables fast recognition. Our experiments on several challenging English and Chinese benchmarks, including the IIIT-5K, SVT, ICDAR03/13 and TRW15 datasets, demonstrate that the proposed method yields superior or comparable performance to state-of-the-art methods while the model size is relatively small.
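To make the CTC decoding step concrete, here is a minimal sketch of greedy (best-path) CTC decoding over per-window classifier scores. It is the simplest CTC decoder and is shown only as an illustration; the abstract does not say which CTC-based decoding variant the authors use. The function name and the convention that class index 0 is the blank symbol are assumptions.

```python
from typing import List, Optional, Sequence

BLANK = 0  # assumed convention: class 0 is the CTC blank symbol


def ctc_greedy_decode(window_scores: Sequence[Sequence[float]],
                      alphabet: str) -> str:
    """Best-path CTC decoding.

    window_scores[t][c] is the score of class c at sliding window t, where
    class 0 is blank and class c > 0 maps to alphabet[c - 1].
    """
    # 1. Pick the highest-scoring class in every window.
    best_path = [max(range(len(s)), key=s.__getitem__) for s in window_scores]

    # 2. Collapse consecutive repeats, then drop blanks.
    chars: List[str] = []
    prev: Optional[int] = None
    for idx in best_path:
        if idx != prev and idx != BLANK:
            chars.append(alphabet[idx - 1])
        prev = idx
    return "".join(chars)
```

The blank between two identical classes is what lets the decoder keep a genuine double letter (as in "hello") while still merging a single character that spans several adjacent windows.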

Computer Vision and Pattern Recognition
Scene Text Detection via Holistic, Multi-Channel Prediction

Cong Yao,Xiang Bai,Nong Sang,Xinyu Zhou,Shuchang Zhou,Zhimin Cao

DOI: https://doi.org/10.48550/arXiv.1606.09002

2016-07-05

Abstract:Recently, scene text detection has become an active research topic in computer vision and document analysis, because of its great importance and significant challenge. However, the vast majority of existing methods detect text within local regions, typically by extracting character, word or line level candidates followed by candidate aggregation and false positive elimination, which potentially excludes the effect of wide-scope and long-range contextual cues in the scene. To take full advantage of the rich information available in the whole natural image, we propose to localize text in a holistic manner, by casting scene text detection as a semantic segmentation problem. The proposed algorithm directly runs on full images and produces global, pixel-wise prediction maps, in which detections are subsequently formed. To make better use of the properties of text, three types of information regarding text region, individual characters and their relationship are estimated with a single Fully Convolutional Network (FCN) model. With such predictions of text properties, the proposed algorithm can simultaneously handle horizontal, multi-oriented and curved text in real-world natural images. The experiments on standard benchmarks, including ICDAR 2013, ICDAR 2015 and MSRA-TD500, demonstrate that the proposed algorithm substantially outperforms previous state-of-the-art approaches. Moreover, we report the first baseline result on the recently released, large-scale dataset COCO-Text.

Computer Vision and Pattern Recognition
I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-Shaped Scene Text Detection

Du Bo,Ye Jian,Zhang Jing,Liu Juhua,Tao Dacheng

DOI: https://doi.org/10.1007/s11263-022-01616-6

IF: 13.369

2022-01-01

International Journal of Computer Vision

Abstract:Existing methods for arbitrary-shaped text detection in natural scenes face two critical issues, i.e., (1) fracture detections at the gaps in a text instance; and (2) inaccurate detections of arbitrary-shaped text instances with diverse background context. To address these issues, we propose a novel method named Intra- and Inter-Instance Collaborative Learning (I3CL). Specifically, to address the first issue, we design an effective convolutional module with multiple receptive fields, which is able to collaboratively learn better character and gap feature representations at local and long ranges inside a text instance. To address the second issue, we devise an instance-based transformer module to exploit the dependencies between different text instances and a global context module to exploit the semantic context from the shared background, which are able to collaboratively learn more discriminative text feature representation. In this way, I3CL can effectively exploit the intra- and inter-instance dependencies together in a unified end-to-end trainable framework. Besides, to make full use of the unlabeled data, we design an effective semi-supervised learning method to leverage the pseudo labels via an ensemble strategy. Without bells and whistles, experimental results show that the proposed I3CL sets new state-of-the-art results on three challenging public benchmarks, i.e., an F-measure of 77.5% on ArT, 86.9% on Total-Text, and 86.4% on CTW-1500. Notably, our I3CL with the ResNeSt-101 backbone ranked the 1st place on the ArT leaderboard. Code is available at www.github.com/ViTAE-Transformer/ViTAE-Transformer-Scene-Text-Detection .
Scene Text Recognition from Two-Dimensional Perspective

Minghui Liao,Jian Zhang,Zhaoyi Wan,Fengming Xie,Jiajun Liang,Pengyuan Lyu,Cong Yao,Xiang Bai

DOI: https://doi.org/10.48550/arXiv.1809.06508

2018-11-17

Abstract:Inspired by speech recognition, recent state-of-the-art algorithms mostly consider scene text recognition as a sequence prediction problem. Though achieving excellent performance, these methods usually neglect an important fact that text in images are actually distributed in two-dimensional space. It is a nature quite different from that of speech, which is essentially a one-dimensional signal. In principle, directly compressing features of text into a one-dimensional form may lose useful information and introduce extra noise. In this paper, we approach scene text recognition from a two-dimensional perspective. A simple yet effective model, called Character Attention Fully Convolutional Network (CA-FCN), is devised for recognizing the text of arbitrary shapes. Scene text recognition is realized with a semantic segmentation network, where an attention mechanism for characters is adopted. Combined with a word formation module, CA-FCN can simultaneously recognize the script and predict the position of each character. Experiments demonstrate that the proposed algorithm outperforms previous methods on both regular and irregular text datasets. Moreover, it is proven to be more robust to imprecise localizations in the text detection phase, which are very common in practice.

Computer Vision and Pattern Recognition
CRNet: A Center-aware Representation for Detecting Text of Arbitrary Shapes

Yu Zhou,Hongtao Xie,Shancheng Fang,Yan Li,Yongdong Zhang

DOI: https://doi.org/10.1145/3394171.3413565

2020-01-01

Abstract:Existing scene text detection methods achieve state-of-the-art performance by designing elaborate anchors or complex post-processing. Nonetheless, most methods still face the dilemma of detecting adjacent texts as one instance and long text with large character spacing as multiple fragments. To tackle these problems, we propose an anchor-free scene text detector leveraging Center-aware Representation to achieve accurate arbitrary-shaped scene text detection namely CRNet. Firstly, we propose a center-aware location algorithm to explicitly learn center regions and center points of text instances, which is able to separate adjacent text instances effectively. Then, a multi-scale context extraction module capable of extracting local context, long-range dependencies and global context adaptively is designed to effectively perceive long text with large character spacing. Finally, a low-level features enhancement block is introduced to enhance the geometric information of text. Extensive experiments conducted on several benchmarks including SCUT-CTW1500, Total-Text, ICDAR2015, ICDAR2017 MLT, and MSRA-TD500 demonstrate the effectiveness of our method. Specifically, without any anchor and complicated post-processing, our CRNet achieves 84.2% and 85.1% on CTW1500 and MSRA-TD500 in F-measure, outperforming all state-of-the-art anchor-based and anchor-free methods.
A holistic representation guided attention network for scene text recognition

Lu Yang,Peng Wang,Hui Li,Zhen Li,Yanning Zhang

DOI: https://doi.org/10.1016/j.neucom.2020.07.010

IF: 6

2020-11-01

Neurocomputing

Abstract:Reading irregular scene text of arbitrary shape in natural images is still a challenging problem, despite the progress made recently. Many existing approaches incorporate sophisticated network structures to handle various shapes, use extra annotations for stronger supervision, or employ hard-to-train recurrent neural networks for sequence modeling. In this work, we propose a simple yet strong approach for scene text recognition. With no need to convert input images to sequence representations, we directly connect two-dimensional CNN features to an attention-based sequence decoder which is guided by a holistic representation. The holistic representation can guide the attention-based decoder to focus on more accurate areas. As no recurrent module is adopted, our model can be trained in parallel. It achieves 1.5× to 9.4× acceleration of the backward pass and 1.3× to 7.9× acceleration of the forward pass, compared with the RNN counterparts. The proposed model is trained with only word-level annotations. With this simple design, our method achieves state-of-the-art or competitive recognition performance on the evaluated regular and irregular scene text benchmark datasets.

computer science, artificial intelligence
MOST: A Multi-Oriented Scene Text Detector with Localization Refinement

Minghang He,Minghui Liao,Zhibo Yang,Humen Zhong,Jun Tang,Wenqing Cheng,Cong Yao,Yongpan Wang,Xiang Bai

DOI: https://doi.org/10.48550/arXiv.2104.01070

2021-04-05

Abstract:Over the past few years, the field of scene text detection has progressed so rapidly that modern text detectors are able to hunt text in various challenging scenarios. However, they might still fall short when handling text instances of extreme aspect ratios and varying scales. To tackle such difficulties, we propose in this paper a new algorithm for scene text detection, which puts forward a set of strategies to significantly improve the quality of text localization. Specifically, a Text Feature Alignment Module (TFAM) is proposed to dynamically adjust the receptive fields of features based on initial raw detections; a Position-Aware Non-Maximum Suppression (PA-NMS) module is devised to selectively concentrate on reliable raw detections and exclude unreliable ones; in addition, we propose an Instance-wise IoU loss for balanced training to deal with text instances of different scales. An extensive ablation study demonstrates the effectiveness and superiority of the proposed strategies. The resulting text detection system, which integrates the proposed strategies with a leading scene text detector EAST, achieves state-of-the-art or competitive performance on various standard benchmarks for text detection while keeping a fast running speed.

Computer Vision and Pattern Recognition
Text-Attentional Convolutional Neural Network for Scene Text Detection

Tong He,Weilin Huang,Yu Qiao,Jian Yao

DOI: https://doi.org/10.1109/tip.2016.2547588

IF: 10.6

2016-06-01

IEEE Transactions on Image Processing

Abstract:Recent deep learning models have demonstrated strong capabilities for classifying text and non-text components in natural images. They extract a high-level feature globally computed from a whole image component (patch), where the cluttered background information may dominate true text features in the deep representation. This leads to less discriminative power and poorer robustness. In this paper, we present a new system for scene text detection by proposing a novel text-attentional convolutional neural network (Text-CNN) that particularly focuses on extracting text-related regions and features from the image components. We develop a new learning mechanism to train the Text-CNN with multi-level and rich supervised information, including text region mask, character label, and binary text/non-text information. The rich supervision information enables the Text-CNN with a strong capability for discriminating ambiguous texts, and also increases its robustness against complicated background components. The training process is formulated as a multi-task learning problem, where low-level supervised information greatly facilitates the main task of text/non-text classification. In addition, a powerful low-level detector called contrast-enhancement maximally stable extremal regions (MSERs) is developed, which extends the widely used MSERs by enhancing intensity contrast between text patterns and background. This allows it to detect highly challenging text patterns, resulting in a higher recall. Our approach achieved promising results on the ICDAR 2013 data set, with an F-measure of 0.82, substantially improving the state-of-the-art results.

computer science, artificial intelligence,engineering, electrical & electronic
Text-Attentional Convolutional Neural Networks for Scene Text Detection

Tong He,Weilin Huang,Yu Qiao,Jian Yao

DOI: https://doi.org/10.48550/arXiv.1510.03283

2015-10-12

Computer Vision and Pattern Recognition

Abstract:Recent deep learning models have demonstrated strong capabilities for classifying text and non-text components in natural images. They extract a high-level feature computed globally from a whole image component (patch), where the cluttered background information may dominate true text features in the deep representation. This leads to less discriminative power and poorer robustness. In this work, we present a new system for scene text detection by proposing a novel Text-Attentional Convolutional Neural Network (Text-CNN) that particularly focuses on extracting text-related regions and features from the image components. We develop a new learning mechanism to train the Text-CNN with multi-level and rich supervised information, including text region mask, character label, and binary text/non-text information. The rich supervision information enables the Text-CNN with a strong capability for discriminating ambiguous texts, and also increases its robustness against complicated background components. The training process is formulated as a multi-task learning problem, where low-level supervised information greatly facilitates the main task of text/non-text classification. In addition, a powerful low-level detector called Contrast-Enhancement Maximally Stable Extremal Regions (CE-MSERs) is developed, which extends the widely used MSERs by enhancing intensity contrast between text patterns and background. This allows it to detect highly challenging text patterns, resulting in a higher recall. Our approach achieved promising results on the ICDAR 2013 dataset, with an F-measure of 0.82, improving the state-of-the-art results substantially.
Specific category region proposal network for text detection in natural scene

Yuanhong Zhong,Xinyu Cheng,Zhaokun Zhou,Shun Zhang,Jing Zhang,Guan Huang

DOI: https://doi.org/10.1049/iet-ipr.2019.0652

IF: 2.3

2020-06-09

IET Image Processing

Abstract:Natural scene text usually carries considerable abstract semantic information, which is closely related to the surrounding environment. Thus, natural scene text detection plays a vital role in image content retrieval and understanding. In this study, the authors propose a novel specific category region proposal network (SCRPN) based on maximally stable extremal regions (MSER) and fully convolutional network (FCN) for natural scene text detection. First, FCN for pixel-level recognition is utilised to obtain the text saliency map and MSER is used to obtain oversegmented regions. Then, the multiple features of oversegmented regions and text saliency map are used for region aggregation. Next, single-linkage clustering method is adopted to cluster the segmentation regions to obtain a hierarchical structure of text region proposals. Finally, for the top-ranking region proposals, SCRPN built an end-to-end pipeline for scene text detection directly. Experiments on street view text and international conference on document analysis and recognition (ICDAR) 2013 have demonstrated the effectiveness of SCRPN for generating the text proposals. SCRPN could work with various two-stage text detection networks; thus, faster region convolutional neural network was used as the text detection framework to evaluate the performance of SCRPN in the ICDAR 2015 and MSRA-TD500 benchmarks. The experimental results confirmed that SCRPN makes text detection more robust in complex scenarios.

computer science, artificial intelligence,engineering, electrical & electronic,imaging science & photographic technology
Towards Accurate Scene Text Recognition with Semantic Reasoning Networks

Deli Yu,Xuan Li,Chengquan Zhang,Junyu Han,Jingtuo Liu,Errui Ding

DOI: https://doi.org/10.48550/arXiv.2003.12294

2020-03-27

Abstract:Scene text image contains two levels of contents: visual texture and semantic information. Although the previous scene text recognition methods have made great progress over the past few years, the research on mining semantic information to assist text recognition attracts less attention, only RNN-like structures are explored to implicitly model semantic information. However, we observe that RNN based methods have some obvious shortcomings, such as time-dependent decoding manner and one-way serial transmission of semantic context, which greatly limit the help of semantic information and the computation efficiency. To mitigate these limitations, we propose a novel end-to-end trainable framework named semantic reasoning network (SRN) for accurate scene text recognition, where a global semantic reasoning module (GSRM) is introduced to capture global semantic context through multi-way parallel transmission. The state-of-the-art results on 7 public benchmarks, including regular text, irregular text and non-Latin long text, verify the effectiveness and robustness of the proposed method. In addition, the speed of SRN has significant advantages over the RNN based methods, demonstrating its value in practical use.

Computer Vision and Pattern Recognition
Tk-Text: Multi-Shaped Scene Text Detection Via Instance Segmentation

Xiaoge Song,Yirui Wu,Wenhai Wang,Tong Lu

DOI: https://doi.org/10.1007/978-3-030-37734-2_17

2020-01-01

Abstract:Benefiting from the development of deep neural networks, scene text detectors have progressed rapidly over the past few years and achieved outstanding performance on several standard benchmarks. However, most existing methods adopt quadrilateral bounding boxes to represent texts, which are usually inadequate to deal with multi-shaped texts such as curved ones. To maintain consistent detection performance on both quadrilateral and curved texts, we present a novel representation, i.e., the text kernel, for multi-shaped texts. On the basis of the text kernel, we propose a simple yet effective scene text detection method named TK-Text. The proposed method consists of three steps, namely a text-context-aware network, segmentation map generation and text kernel based post-clustering. In the text-context-aware network step, we construct a segmentation-based network to extract a feature map from natural scene images, which is further enhanced with text context information extracted by an attention scheme, TKAB. In segmentation map generation, text kernels and rough boundaries of text instances are segmented based on the enhanced feature map. Finally, rough text instances are gradually refined to generate accurate text instances by performing clustering based on the text kernel. Experiments on public benchmarks including SCUT-CTW1500, ICDAR 2015 and ICDAR 2017 MLT demonstrate that the proposed method achieves competitive detection performance compared with existing methods.

Scene text recognition via context modeling for low-quality image in logistics industry

Herui Heng,Peiji Li,Tuxin Guan,Tianyu Yang

DOI: https://doi.org/10.1007/s40747-022-00916-1

IF: 6.7

2022-11-30

Complex & Intelligent Systems

Abstract:Text recognition has recently been applied in many fields, such as robot vision, video retrieval, and scene understanding. However, minimal research has been conducted in the field of logistics, where images of express sheets captured by cameras are mostly curved, distorted, and of low resolution. In this study, a new method is proposed to address this research gap while simultaneously considering irregular and low-resolution English letters. The approach comprises a rectification module, a convolutional neural network (CNN) extractor, a semantic context module (SCM), a global context module (GCM), and a lightweight transformer decoder that improves training speed. In particular, we introduce context modeling into the method. (1) The proposed SCM captures full-image dependencies and generates rich semantic context information. (2) We propose the GCM, which not only enhances long-range dependencies from the output of the SCM but also outputs abundant pixel information to the self-attention decoder. (3) To solve the low-resolution text recognition problem in a large number of express sheet scenes, we propose Chinese datasets for improving intelligent logistics. Experiments conducted on six public benchmarks demonstrate that the developed method achieves better robustness to low-resolution and irregular text images.

computer science, artificial intelligence
