Decoupling Layout from Glyph in Online Chinese Handwriting Generation

Min-Si Ren, Yan-Ming Zhang, Yi Chen
2024-10-04
Abstract: Text plays a crucial role in the transmission of human civilization, and teaching machines to generate online handwritten text in various styles presents an interesting and significant challenge. However, most prior work has concentrated on generating individual Chinese fonts, leaving complete text line generation largely unexplored. In this paper, we identify that text lines can naturally be divided into two components: layout and glyphs. Based on this division, we design a text line layout generator coupled with a diffusion-based stylized font synthesizer to address this challenge hierarchically. More concretely, the layout generator performs in-context-like learning based on the text content and the provided style references to generate the position of each glyph autoregressively. Meanwhile, the font synthesizer, which consists of a character embedding dictionary, a multi-scale calligraphy style encoder, and a 1D U-Net based diffusion denoiser, generates each font at its position while imitating the calligraphy style extracted from the given style references. Qualitative and quantitative experiments on CASIA-OLHWDB demonstrate that our method is capable of generating structurally correct and indistinguishable imitation samples.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the generation of online Chinese handwritten text lines that are structurally correct and stylistically consistent, given the text content and writing-style reference samples. The focus is on complete text lines, not just single Chinese characters or fonts. This involves two main challenges:

1. **Ensuring the structural correctness of each character**: Chinese characters have complex geometric structures and specific stroke orders, so keeping each character structurally correct during generation is difficult.
2. **Arranging the relative positions of different characters**: The layout of the whole text line must look natural, especially in the relative positions of Chinese characters and punctuation marks.

To address these challenges, the paper proposes a hierarchical method that separates text-line layout generation from the generation of individual characters:

- **Layout Generator**: Generates the position of each character from the text content and the provided writing-style reference samples. It produces the bounding box of each character autoregressively.
- **Font Synthesizer**: Built on a 1D U-Net and a multi-scale calligraphy-style encoder, it generates the glyph of each character while imitating the calligraphy style extracted from the given style reference samples.

With this method, the paper aims to generate online Chinese handwritten text lines with correct structure and consistent style, and thereby a more natural writing effect.
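The two-stage split described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: `Box`, `generate_layout`, `place_glyph`, `generate_line`, and the two toy stand-ins are hypothetical names introduced here, and the learned layout generator and diffusion-based font synthesizer are replaced by trivial hand-written functions. The sketch only shows the control flow: bounding boxes are predicted one at a time from the earlier boxes, and each glyph, assumed to be produced in a unit square, is then scaled into its box.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[float, float]


@dataclass
class Box:
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def generate_layout(text: str,
                    next_box: Callable[[str, List[Box]], Box]) -> List[Box]:
    """Autoregressive loop: each box depends on the character and all earlier boxes."""
    boxes: List[Box] = []
    for ch in text:
        boxes.append(next_box(ch, boxes))
    return boxes


def place_glyph(points: List[Point], box: Box) -> List[Point]:
    """Map glyph points from the unit square into the target bounding box."""
    return [(box.x + px * box.w, box.y + py * box.h) for px, py in points]


def generate_line(text: str,
                  next_box: Callable[[str, List[Box]], Box],
                  synthesize: Callable[[str], List[Point]]) -> List[List[Point]]:
    """Stage 1: lay out the boxes. Stage 2: synthesize each glyph into its box."""
    boxes = generate_layout(text, next_box)
    return [place_glyph(synthesize(ch), b) for ch, b in zip(text, boxes)]


# Toy stand-ins (assumptions): a real system would use learned, style-conditioned models.
def toy_next_box(ch: str, prev: List[Box]) -> Box:
    x = prev[-1].x + prev[-1].w + 0.1 if prev else 0.0
    size = 0.3 if ch in ",.。，" else 1.0  # punctuation gets a smaller box
    return Box(x=x, y=1.0 - size, w=size, h=size)  # bottom-aligned on the line


def toy_synthesize(ch: str) -> List[Point]:
    return [(0.0, 0.0), (1.0, 1.0)]  # one diagonal stroke in the unit square


line = generate_line("你好。", toy_next_box, toy_synthesize)
```

Here `line` holds one list of points per character. Keeping layout and glyph synthesis behind separate callables mirrors the decoupling the paper argues for: either stage can be swapped out without touching the other.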