$\mathrm{T}^2$Net: An Improved Image-Based Text Transfer Framework Using Background Inpainting and Text Conversion

Haibin Zhou, Lujiao Shao, Boxiang Jia, Haijun Zhang
DOI: https://doi.org/10.1007/s44244-023-00010-6
2023-01-01
Industrial Artificial Intelligence
Abstract: Text, which is regarded as one of the important clues for visual recognition, can provide rich and accurate high-level semantic information. Therefore, the detection and recognition of textual data have become a research hotspot in computer vision and artificial intelligence. However, the difficulty of data collection and the non-uniform distribution of characters still pose challenges for accurate text recognition, especially for complicated character sets such as Chinese. To address small-sample text recognition, we propose an improved image-based text transfer framework, named $\mathrm{T}^2$Net. The framework can replace or modify the text content in an image, making it possible to expand a recognition data set arbitrarily. Considering that the main challenge of text transfer lies in decoupling the complex interrelationship between text and background, a text content mask branch is first added to the background inpainting module so that background textures are restored more realistically. Second, a text recognition model is developed to guide the readability of the text transfer results in the text conversion module. Finally, a text fusion module is used to fuse the independently transferred background and text. We evaluated the proposed framework on a real-world scene text recognition data set. Qualitative and quantitative results demonstrate the efficiency of our method in comparison with previous works.
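The data flow described in the abstract (erase the old text from the background, produce the new text separately, then fuse the two) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's modules are learned networks, whereas the functions below are hand-written stand-ins. The names `inpaint_background` and `fuse`, the mean-fill inpainting, and the per-pixel compositing rule are all assumptions made for illustration, and the text conversion step is replaced by a target text image supplied directly.

```python
from typing import List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def inpaint_background(image: Image, text_mask: Image) -> Image:
    """Hypothetical stand-in for the background inpainting module.

    Every pixel covered by the text mask is replaced with the mean of the
    unmasked pixels. A learned model would restore texture instead.
    """
    background = [p for row, mrow in zip(image, text_mask)
                  for p, m in zip(row, mrow) if m == 0]
    fill = sum(background) / len(background) if background else 0.0
    return [[fill if m else p for p, m in zip(row, mrow)]
            for row, mrow in zip(image, text_mask)]


def fuse(background: Image, text: Image, text_mask: Image) -> Image:
    """Hypothetical stand-in for the fusion module: per-pixel compositing.

    Where the mask is 1 the text pixel is used, where it is 0 the
    background pixel is kept.
    """
    return [[m * t + (1 - m) * b for b, t, m in zip(brow, trow, mrow)]
            for brow, trow, mrow in zip(background, text, text_mask)]


# Source image: a bright vertical stroke (old text) on a darker background.
source = [[0.25, 1.0, 0.25],
          [0.25, 1.0, 0.25]]
source_mask = [[0, 1, 0],
               [0, 1, 0]]

# Target text: a dark stroke in the first column (assumed to come from a
# text conversion step that is not modeled here).
target_text = [[0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]
target_mask = [[1, 0, 0],
               [1, 0, 0]]

clean = inpaint_background(source, source_mask)
result = fuse(clean, target_text, target_mask)
```

The mask plays the role the abstract assigns to the text content mask branch: it tells the inpainting step which pixels belong to text and must be restored from the surrounding background.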