VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and Quantization

Mingshuai Yao, Yabo Zhang, Xianhui Lin, Xiaoming Li, Wangmeng Zuo
2023-08-27
Abstract: Few-shot font generation is challenging, as it needs to capture the fine-grained stroke styles from a limited set of reference glyphs, and then transfer to other characters, which are expected to have similar styles. However, due to the diversity and complexity of Chinese font styles, the synthesized glyphs of existing methods usually exhibit visible artifacts, such as missing details and distorted strokes. In this paper, we propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement. Specifically, we pre-train a VQGAN to encapsulate font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes. Furthermore, our VQ-Font leverages the inherent design of Chinese characters, where structure components such as radicals and character components are combined in specific arrangements, to recalibrate fine-grained styles based on references. This process improves the matching and fusion of styles at the structure level. Both modules collaborate to enhance the fidelity of the generated fonts. Experiments on a collected font dataset show that our VQ-Font outperforms the competing methods both quantitatively and qualitatively, especially in generating challenging styles.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper targets few-shot font generation for Chinese characters. Because Chinese font styles are diverse and complex, glyphs synthesized by existing methods often show visible artifacts such as missing details and distorted strokes. The paper proposes VQ-Font, a VQGAN-based framework that improves glyph fidelity through token prior refinement and structure-aware enhancement. It addresses three issues:

1. **Detail Loss and Stroke Distortion**: Existing methods tend to lose details and distort strokes when handling complex styles (e.g., serif and artistic fonts). VQ-Font pre-trains a VQGAN so that its codebook encapsulates a font token prior, then refines synthesized glyphs with that codebook to eliminate the domain gap between synthesized and real-world strokes.
2. **Insufficient Utilization of Structural Information**: Existing methods often overlook the structure of Chinese characters, which makes it hard to match and blend styles accurately during style transfer. VQ-Font introduces a Structure-level Style Enhancement Module (SSEM) that uses the specific arrangement of structural components (such as radicals and character components) to recalibrate fine-grained styles based on the references, improving style matching and fusion at the structure level.
3. **Limited Generalization Ability**: Existing methods perform poorly on unseen font styles. VQ-Font improves generalization to unseen styles by drawing on the learned token prior and on structural information.

In summary, the paper aims to improve synthesis quality in few-shot font generation by addressing detail loss, stroke distortion, and the insufficient use of structural information in existing methods.
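
The codebook refinement described above rests on the nearest-neighbor lookup at the core of vector quantization: each feature vector is replaced by its closest codebook entry. The following is a minimal sketch of that lookup only, not the authors' implementation. The function names, the 2-D toy codebook, and the feature values are all hypothetical and chosen for illustration; a real VQGAN learns its codebook jointly with an encoder and decoder.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[Vector]]:
    """Map each feature vector to its nearest codebook entry.

    Returns the chosen token indices and the matching codebook vectors.
    """
    indices = [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in features
    ]
    return indices, [codebook[i] for i in indices]


# Hypothetical toy codebook with three "stroke tokens" (illustrative values).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
# Hypothetical noisy features, standing in for a synthesized glyph's features.
features = [(0.9, 0.1), (0.1, 0.8), (0.2, 0.1)]

indices, quantized = quantize(features, codebook)
print(indices)    # token index chosen for each feature
print(quantized)  # features snapped onto codebook entries
```

In this toy setting, snapping noisy features onto entries learned from real glyphs is what pulls synthesized strokes back toward the real-stroke domain; how VQ-Font predicts or refines the token indices is not specified in this summary.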