An Unsupervised Font Style Transfer Model Based on Generative Adversarial Networks.

Zeng Sihan, Pan Zhongliang
DOI: https://doi.org/10.1007/s11042-021-11777-0
IF: 2.577
2021-01-01
Multimedia Tools and Applications
Abstract: Chinese characters are numerous and structurally complex, so designing a complete character set is extremely time-consuming for designers. As the use of characters grows rapidly in fields such as culture and business, Chinese font design cannot keep pace with demand. Although most existing Chinese character transformation models greatly alleviate this demand, they cannot guarantee the semantics of the generated characters, and their generation efficiency is low. They also require large amounts of paired training data, which takes considerable time to prepare. To address these problems, this paper proposes an unsupervised Chinese character generation method based on generative adversarial networks (GANs), whose generator fuses a Style-Attentional Net into a skip-connected U-Net. The generator effectively and flexibly integrates local style patterns according to the semantic spatial distribution of the content image while retaining feature information at different scales. After training, the model generates fonts that preserve the content features of the source domain and the style features of the target domain. A style specification module and a classification discriminator allow the model to generate typefaces in multiple styles. The results show that the proposed model performs Chinese character style transfer well, generating high-quality character images with complete structures and natural strokes. In both quantitative and qualitative comparison experiments, our model achieves better visual quality and image performance metrics than existing models. In sample size experiments, the model still generates clearly structured fonts, demonstrating significant robustness. The model's training conditions are also easy to meet, which facilitates generalization to real applications.
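
The style-attention idea mentioned in the abstract (mixing style features into each content position according to feature similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic dot-product attention in plain Python, and it omits the learned projections, feature normalization, U-Net skip connections, and GAN training that a full model would need. The function names `softmax` and `style_attention` and the toy feature vectors are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def style_attention(content, style):
    """Blend style features into each content position (illustrative only).

    content: list of feature vectors, one per content position.
    style:   list of feature vectors, one per style position (same dimension).
    Returns one output vector per content position.
    """
    out = []
    for q in content:
        # Similarity of this content position to every style position.
        scores = [sum(a * b for a, b in zip(q, k)) for k in style]
        weights = softmax(scores)
        # Weighted average of style features for this content position.
        mixed = [
            sum(w * v[i] for w, v in zip(weights, style))
            for i in range(len(q))
        ]
        # Residual connection keeps the original content information.
        out.append([c + m for c, m in zip(q, mixed)])
    return out


# Toy features (hypothetical values): 2 content positions, 3 style positions.
content = [[1.0, 0.0], [0.0, 1.0]]
style = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
stylized = style_attention(content, style)
```

Each content position receives a different mixture of style features, which is the sense in which local style patterns follow the spatial layout of the content image; the residual term is one simple way to keep the content structure intact.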