VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and Quantization

Mingshuai Yao, Yabo Zhang, Xianhui Lin, Xiaoming Li, Wangmeng Zuo

2023-08-27

Abstract: Few-shot font generation is challenging, as it needs to capture the fine-grained stroke styles from a limited set of reference glyphs, and then transfer to other characters, which are expected to have similar styles. However, due to the diversity and complexity of Chinese font styles, the synthesized glyphs of existing methods usually exhibit visible artifacts, such as missing details and distorted strokes. In this paper, we propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement. Specifically, we pre-train a VQGAN to encapsulate font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes. Furthermore, our VQ-Font leverages the inherent design of Chinese characters, where structure components such as radicals and character components are combined in specific arrangements, to recalibrate fine-grained styles based on references. This process improves the matching and fusion of styles at the structure level. Both modules collaborate to enhance the fidelity of the generated fonts. Experiments on a collected font dataset show that our VQ-Font outperforms the competing methods both quantitatively and qualitatively, especially in generating challenging styles.
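
To make the codebook idea concrete, here is a minimal sketch of the nearest-neighbor lookup that vector-quantized models such as VQGAN generally rely on: each encoder feature is replaced by its closest codebook entry. The function name `quantize`, the array shapes, and the toy values are assumptions for illustration only; the abstract does not specify the codebook size, the feature dimension, or how VQ-Font applies the lookup during refinement.

```python
import numpy as np

def quantize(features, codebook):
    """Replace each feature vector with its nearest codebook entry.

    features: (N, D) array of encoder outputs (hypothetical shape).
    codebook: (K, D) array of learned token embeddings (hypothetical shape).
    Returns the quantized features (N, D) and the chosen indices (N,).
    """
    # Squared Euclidean distance via ||f||^2 - 2 f.c + ||c||^2, shape (N, K).
    dists = (
        (features ** 2).sum(axis=1, keepdims=True)
        - 2.0 * features @ codebook.T
        + (codebook ** 2).sum(axis=1)
    )
    indices = dists.argmin(axis=1)
    return codebook[indices], indices

# Toy example: three 2-D codebook entries and three noisy features.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
features = np.array([[0.1, -0.1], [0.9, 1.2], [3.5, 4.2]])
quantized, indices = quantize(features, codebook)
```

In this toy case every noisy feature snaps to the codebook entry it sits closest to, which is the sense in which a learned codebook can pull synthesized strokes back toward patterns seen in real glyphs.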

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper addresses problems in few-shot font generation, where existing methods often lose detail and distort strokes when synthesizing Chinese characters. Because Chinese font styles are diverse and complex, these methods usually produce noticeable artifacts. The paper proposes a VQGAN-based framework (i.e., VQ-Font) to improve glyph fidelity through token prior refinement and structure-aware enhancement. It mainly addresses the following issues:

1. **Detail Loss and Stroke Distortion**: Existing methods tend to suffer from detail loss and stroke distortion when dealing with complex styles (e.g., serif and artistic fonts). VQ-Font encapsulates font token priors through a pre-trained VQGAN model and codebook, thereby eliminating the domain gap between synthetic strokes and real-world strokes.
2. **Insufficient Utilization of Structural Information**: Existing methods often overlook the structural information of Chinese characters, making it difficult to accurately match and blend styles during style transfer. VQ-Font introduces a Structure-level Style Enhancement Module (SSEM) that leverages the specific arrangement of structural components (such as radicals and character components) to recalibrate fine-grained styles, thus better matching and blending styles at the structural level.
3. **Limited Generalization Ability**: Existing methods perform poorly when handling unseen font styles. VQ-Font enhances the generalization ability to unseen font styles through rich prior knowledge and structural information.

In summary, the main goal of this paper is to improve synthesis quality in few-shot font generation by addressing detail loss, stroke distortion, and insufficient utilization of structural information in existing methods.
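
The structure-level matching described above can be illustrated with generic scaled dot-product cross-attention, in which each content position gathers style from the reference positions it resembles most. This is only a sketch of that general technique: the internals of SSEM are not given here, and the function names, shapes, and toy values below are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(content_q, ref_k, ref_v):
    """Aggregate reference style features for each content position.

    content_q: (N, D) queries from the content glyph (hypothetical).
    ref_k:     (M, D) keys from the reference glyphs (hypothetical).
    ref_v:     (M, C) style values from the reference glyphs (hypothetical).
    Returns the fused styles (N, C) and the attention weights (N, M).
    """
    scale = np.sqrt(content_q.shape[-1])
    weights = softmax(content_q @ ref_k.T / scale, axis=-1)
    return weights @ ref_v, weights

# Toy example: one content position that strongly matches the first reference.
content_q = np.array([[10.0, 0.0]])
ref_k = np.array([[1.0, 0.0], [0.0, 1.0]])
ref_v = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
fused, weights = cross_attend(content_q, ref_k, ref_v)
```

Here the single content query aligns with the first reference key, so nearly all of the attention weight, and therefore nearly all of the fused style, comes from the first reference position.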

Few shot font generation via transferring similarity guided global style and quantization local style

ZiGAN: Fine-grained Chinese Calligraphy Font Generation via a Few-shot Style Transfer Approach

Few-Shot Font Generation by Learning Fine-Grained Local Styles

Few-shot Font Generation based on SAE and Diffusion Model

Few-shot Font Generation with Localized Style Representations and Factorization

FontTransformer: Few-shot High-resolution Chinese Glyph Image Synthesis via Stacked Transformers

XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation

QT-Font: High-efficiency Font Synthesis Via Quadtree-based Diffusion Models

MA-Font: Few-Shot Font Generation by Multi-Adaptation Method

CF-Font: Content Fusion for Few-shot Font Generation

Handwritten Chinese Font Generation with Collaborative Stroke Refinement

HFH-Font: Few-shot Chinese Font Synthesis with Higher Quality, Faster Speed, and Higher Resolution

DeepCalliFont: Few-shot Chinese Calligraphy Font Synthesis by Integrating Dual-modality Generative Models

Diff-Font: Diffusion Model for Robust One-Shot Font Generation

FontGAN: A Unified Generative Framework for Chinese Character Stylization and De-stylization

Few-shot Font Generation with Weakly Supervised Localized Representations

FET-GAN: Font and Effect Transfer via K-shot Adaptive Instance Normalization

Few-shot Font Generation by Learning Style Difference and Similarity

Efficient and Scalable Chinese Vector Font Generation via Component Composition

An end-to-end Chinese font generation network with stroke semantics and deformable attention skip-connection