A Text-Specific Domain Adaptive Network for Scene Text Detection in the Wild.

Xuan He, Jin Yuan, Mengyao Li, Runmin Wang, Haidong Wang, Zhiyong Li
DOI: https://doi.org/10.1007/s10489-023-04873-1
IF: 5.3
2023-01-01
Applied Intelligence
Abstract: Scene text detection has drawn increasing attention due to its potential scalability to large-scale applications. Currently, a scene text detection model that is well trained on a source domain usually performs unsatisfactorily when it is migrated to a target domain, because of the large domain shift between them. To bridge this gap, this paper proposes a novel network that integrates both text-specific Faster R-CNN (ts-FRCNN) and domain adaptation (ts-DA) into one framework. Compared to conventional FRCNN, ts-FRCNN designs a text-specific RPN to generate more accurate region proposals by considering the inherent characteristics of scene text, as well as text-specific RoI pooling to extract purer and sufficient fine-grained text features by adopting an adaptive asymmetric gridding strategy. Compared to conventional domain adaptation, ts-DA adopts a triple-level alignment strategy to reduce the domain shift at the image, word, and character levels, and builds a triple-consistency regularization among them, which significantly promotes domain-invariant text feature learning. We conduct extensive experiments on three representative transfer learning tasks: common-to-extreme scenes, real-to-real scenes, and synthetic-to-real scenes. The experimental results demonstrate that our model consistently outperforms the previous methods.