Adaptive Importance Pooling Network for Scene Text Recognition

Peng Ren, Qingsong Yu, Xuanqi Wu, Ziyang Wang
DOI: https://doi.org/10.1145/3404555.3404614
2020-01-01
Abstract: Scene text recognition (STR) has attracted extensive attention in the pattern recognition community. With the development of deep learning, object detection and sequence recognition schemes based on deep neural networks have been widely used for this task. Discriminative features play a vital role when text appears against complex scene backgrounds. However, for specific tasks, inappropriate pooling strategies may lose feature details. To tackle this problem, this paper proposes an end-to-end adaptive importance pooling network (AIPN). Concretely, we embed the novel adaptive importance pooling (AIP) strategy into the feature extraction stage. Additionally, we adopt an attention-based LSTM as the decoder so that it automatically focuses on informative image feature regions while predicting the final recognition results. Furthermore, to reduce the burden of feature representation on the subsequent recognition stage, a text rectification network (TRN) supervised by the text recognition part is used to normalize the input text images. Experimental results show that our model achieves encouraging performance on the STR benchmark datasets IIIT5K, SVT, ICDAR-2003 and ICDAR-2013.
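The abstract does not say how the AIP strategy computes or applies importance, so the following is only an illustrative sketch of a generic importance-weighted pooling, not the authors' method. It assumes non-overlapping 2x2 windows and a hypothetical `importance_map` of per-position scores (in a real network these would be learned); the function name `importance_pool_2x2` is likewise an assumption made for illustration. Each window is reduced to a softmax-weighted sum of its values, which may help show how such a scheme can sit between average and max pooling.

```python
import math


def importance_pool_2x2(feature_map, importance_map):
    """Pool non-overlapping 2x2 windows with softmax-normalised importance weights.

    Both arguments are lists of rows of floats with the same shape.
    `importance_map` is a hypothetical stand-in for learned importance scores.
    A trailing odd row or column is dropped.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        out_row = []
        for c in range(0, cols - 1, 2):
            # Coordinates of the four cells in this window.
            cells = [(r + dr, c + dc) for dr in (0, 1) for dc in (0, 1)]
            scores = [importance_map[i][j] for i, j in cells]
            # Numerically stable softmax over the window's importance scores.
            peak = max(scores)
            exps = [math.exp(s - peak) for s in scores]
            total = sum(exps)
            # Weighted sum of the feature values in the window.
            out_row.append(
                sum(e / total * feature_map[i][j] for e, (i, j) in zip(exps, cells))
            )
        pooled.append(out_row)
    return pooled
```

With equal importance scores the weights are uniform and the result matches average pooling; when one score dominates, the output approaches that position's feature value, much as max pooling would select a single cell.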