Letter embedding guidance diffusion model for scene text editing

Changshuo Wang, Lei Wu, Xu Chen, Xiang Li, Lei Meng, Xiangxu Meng
DOI: https://doi.org/10.1109/ICME55011.2023.00107
2023-01-01
Abstract: Scene text editing (STE) aims to replace the text in a scene image with target text while retaining the original style. Existing models are based on GANs, in which the source image and the target text are input only once during generation, so they cannot fully capture the style of the source image and the content of the target text. In this paper, we propose an STE method based on a classifier-free guidance diffusion model. To the best of our knowledge, this is the first work to apply diffusion models to the STE task. Specifically, we divide the STE task into multiple steps and extract style information and text content information at each step. In addition, we introduce a letter embedding method as guidance. Experiments show that our method outperforms other STE models in overall realism and glyph preservation.
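As a rough illustration of the classifier-free guidance idea mentioned in the abstract, the sketch below combines a conditional and an unconditional noise prediction at a single denoising step. It uses one common parametrization of the general technique and is not taken from the paper: the function name `cfg_combine`, the `guidance_scale` parameter, and the toy values are assumptions, and in the paper's setting the conditioning would presumably carry the style and letter-embedding information.

```python
from typing import List


def cfg_combine(
    eps_uncond: List[float],
    eps_cond: List[float],
    guidance_scale: float,
) -> List[float]:
    """Blend two noise predictions with classifier-free guidance.

    Hypothetical helper, not the authors' code. Uses the common form
        eps = eps_uncond + s * (eps_cond - eps_uncond)
    where s = 0 gives the unconditional prediction, s = 1 the
    conditional one, and s > 1 pushes further toward the condition.
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("predictions must have the same length")
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy values standing in for the denoiser's outputs at one step.
eps_uncond = [0.0, 1.0, -1.0]   # prediction without conditioning
eps_cond = [1.0, 1.0, 0.0]      # prediction with conditioning
guided = cfg_combine(eps_uncond, eps_cond, guidance_scale=2.0)
print(guided)
```

In a full sampler, this combination would be applied at every denoising step, which is consistent with the abstract's point that style and content information are used repeatedly rather than only once.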