FENet: Feature Enhancement Network for Arbitrary Direction Text Detection
Runmin Wang, Yingying Liu, Chang Han, Guilin Zhu, Minghao Liu, Hua Chen, Yajun Ding, Changxin Gao, Nong Sang
DOI: https://doi.org/10.2139/ssrn.4161312
2022-01-01
SSRN Electronic Journal
Abstract: With the development of deep learning, scene text detection has become a common component in a wide range of applications, such as autonomous driving, scene text translation, and scene semantic understanding. Existing frameworks usually encode text candidates independently, using either hand-crafted features or a series of deep convolutional layers (e.g., ResNet, VGGNet). These methods run into difficulties such as the variety of text shapes, complex backgrounds, and a limited ability to retain long-range information. In contrast, our proposed network, named Feature Enhancement Network (FENet), deals effectively with arbitrary direction text detection by integrating an attention mechanism with Bi-directional Long Short-Term Memory (BLSTM). Our method has three key characteristics: (i) The feature maps are recurrently strengthened by integrating an attention-based CBAM module with a BLSTM-based sequence feature extraction module. (ii) The components of the ResNet bottleneck layer are updated to facilitate micro-batch training of the network. (iii) The learnability of the network is enhanced by introducing a residual structure into the BLSTM-based sequence feature extraction module. Extensive experiments have been carried out on five public datasets, i.e., ICDAR 2013, ICDAR 2015, MSRA-TD500, SCUT-CTW1500, and ICDAR2017-MLT, and the experimental results demonstrate that the proposed method achieves promising results. We will release code at https://github.com/lyy0117 to facilitate future community research.
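To make characteristics (i) and (iii) more concrete, the following is a minimal PyTorch sketch of a CBAM-style attention module followed by a BLSTM wrapped in a residual connection. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the order of the modules, the hidden sizes, or how the feature map is turned into sequences. Here each feature-map row is treated as one sequence, and the class names `CBAM` and `ResidualBLSTM`, the `reduction` ratio, and the linear projection are all hypothetical choices.

```python
import torch
import torch.nn as nn


class CBAM(nn.Module):
    """CBAM-style block: channel attention followed by spatial attention."""

    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        hidden = max(channels // reduction, 1)  # assumed reduction ratio
        # Shared MLP applied to the average- and max-pooled channel descriptors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1, bias=False),
            nn.ReLU(),
            nn.Conv2d(hidden, channels, kernel_size=1, bias=False),
        )
        # Convolution over the two channel-pooled maps gives a spatial mask.
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)  # channel reweighting
        pooled = torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1
        )
        return x * torch.sigmoid(self.spatial(pooled))  # spatial reweighting


class ResidualBLSTM(nn.Module):
    """BLSTM over each feature-map row, added back to the input (residual)."""

    def __init__(self, channels: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(channels, hidden, batch_first=True, bidirectional=True)
        # Project the concatenated forward/backward states back to `channels`
        # so the skip connection can be a plain addition (an assumption).
        self.proj = nn.Linear(2 * hidden, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Assumption: every row of the feature map is one sequence of length w.
        seq = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        out, _ = self.lstm(seq)
        out = self.proj(out).reshape(b, h, w, c).permute(0, 3, 1, 2)
        return x + out  # residual connection around the sequence module
```

As a usage example, `nn.Sequential(CBAM(32), ResidualBLSTM(32))` applied to a tensor of shape `(2, 32, 8, 16)` returns a tensor of the same shape, so the pair could be dropped between backbone stages without changing downstream layer sizes.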