Scene Text Recognition Via Gated Cascade Attention

Siwei Wang, Yongtao Wang, Xiaoran Qin, Qijie Zhao, Zhi Tang
DOI: https://doi.org/10.1109/icme.2019.00179
2019-01-01
Abstract: Scene text recognition is very challenging because of complex backgrounds, low resolution, perspective distortion, and curved text placement. Most state-of-the-art methods adopt the attention-based encoder-decoder framework and often perform suboptimally on challenging text images because the attention region is misaligned with the target character region. In this paper, a novel module, named Gated Cascade Attention Module (GCAM), is proposed to increase the alignment precision of attention in a cascaded manner. Moreover, a channel and spatial attention module is introduced into the encoder to extract more discriminative features for text recognition. By assembling these two modules, a novel scene text recognizer is developed, and extensive experiments demonstrate that it achieves state-of-the-art results on multiple benchmarks of regular and irregular text images.