Highlights
• Region-Aware Contrastive loss (RAC-loss) maximizes the style information between patches of the generated and reference images, which self-supervises the generator on local style.
• With the proposed RAC-loss, our one-shot font generator (RAC-Font) outperforms previous few-shot and one-shot methods both quantitatively and qualitatively.
• Through self-supervision of local style, we propose a model that has a much simpler structure than previous methods and allows real-time inference.
• The proposed method considers style at a finer-grained (patch) level than previous component-level approaches, resulting in more fine-grained font images.

Abstract: Compositional scripts such as Hangeul (Korean characters) and Chinese characters involve numerous characters, making manual font design labor-intensive and costly. Although many few-shot font generation methods have been introduced, each has at least one of the following limitations: missing local font styles, a need for additional component labeling, or high complexity in network structure and training. To address these limitations, and based on our observation that font style can be perceived at the patch level rather than the component level, we propose Region-Aware Contrastive loss (RAC-loss) so that the generator can capture local style through self-supervision. The proposed loss maximizes the style information between patches of the generated image and the style reference image. We also introduce an attention mechanism into the patch-level contrastive loss to handle multiple patch correspondences. This attention learns the style similarity between two glyph images, which serves as a patch-correspondence map. RAC-loss gives the generator more fine-grained feedback than a component-level loss, allowing it to incorporate local styles even with a straightforward structure such as a Visual Geometry Group network (VGGNet). This results in a fast inference latency (3.02 ms), and the proposed method achieves a mean Fréchet Inception Distance (mFID) of 43.18 on the test dataset, a notable decrease of 5.42 compared to the previous method.
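
To make the idea of a patch-level contrastive loss with attention concrete, the following is a minimal sketch and not the authors' implementation. It assumes an InfoNCE-style objective, and it replaces the learned attention described in the abstract with a plain softmax over cosine similarities as a stand-in patch-correspondence map. The function name `attention_weighted_patch_nce`, the temperatures `tau` and `attn_tau`, and the toy two-dimensional patch features are all hypothetical. Pure Python is used only to keep the sketch free of dependencies.

```python
import math


def l2_normalize(v):
    """Scale a feature vector to unit length (zero vectors are left as-is)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weighted_patch_nce(gen_patches, ref_patches, neg_patches,
                                 tau=0.07, attn_tau=0.1):
    """Hypothetical attention-weighted InfoNCE over patch features.

    gen_patches: patch features of the generated glyph (queries)
    ref_patches: patch features of the style reference (candidate positives)
    neg_patches: patch features of glyphs in other styles (negatives)
    """
    gen = [l2_normalize(p) for p in gen_patches]
    ref = [l2_normalize(p) for p in ref_patches]
    neg = [l2_normalize(p) for p in neg_patches]

    total = 0.0
    for q in gen:
        pos_sims = [dot(q, r) for r in ref]
        neg_sims = [dot(q, n) for n in neg]
        # Assumption: similarity softmax stands in for the learned attention,
        # giving a soft correspondence between this patch and reference patches.
        attn = softmax([s / attn_tau for s in pos_sims])
        neg_exp = sum(math.exp(s / tau) for s in neg_sims)
        for w, s in zip(attn, pos_sims):
            pos_exp = math.exp(s / tau)
            # InfoNCE term for one (query, positive) pair, weighted by attention.
            total += -w * math.log(pos_exp / (pos_exp + neg_exp))
    return total / len(gen)


# Toy usage: a generated glyph whose patches match the reference style should
# receive a lower loss than one whose patches match the negative style.
ref = [[1.0, 0.0], [0.0, 1.0]]
neg = [[-1.0, 0.0], [0.0, -1.0]]
loss_matched = attention_weighted_patch_nce(ref, ref, neg)
loss_mismatched = attention_weighted_patch_nce(neg, ref, neg)
print(loss_matched < loss_mismatched)
```

Weighting every reference patch by attention, rather than picking a single positive, is one way to handle the multiple patch correspondences mentioned above: several reference patches can share the style of one generated patch, and each contributes in proportion to its weight.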

Look Closer to Supervise Better: One-Shot Font Generation Via Component-Based Discriminator

Few-shot Font Generation with Localized Style Representations and Factorization

Few-shot Font Generation with Weakly Supervised Localized Representations

XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation

TcGAN: Semantic-Aware and Structure-Preserved GANs with Individual Vision Transformer for Fast Arbitrary One-Shot Image Generation

FontGAN: A Unified Generative Framework for Chinese Character Stylization and De-stylization

Diff-Font: Diffusion Model for Robust One-Shot Font Generation

ZiGAN: Fine-grained Chinese Calligraphy Font Generation via a Few-shot Style Transfer Approach

Few-shot Font Generation based on SAE and Diffusion Model

Few-Shot Font Generation by Learning Fine-Grained Local Styles

Decoupled Representation Learning for Character Glyph Synthesis

FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning

Few shot font generation via transferring similarity guided global style and quantization local style

FET-GAN: Font and Effect Transfer via K-shot Adaptive Instance Normalization

Few-shot Font Generation by Learning Style Difference and Similarity

One-shot font generation via local style self-supervision using Region-Aware Contrastive Loss

MA-Font: Few-Shot Font Generation by Multi-Adaptation Method

Pyramid Embedded Generative Adversarial Network for Automated Font Generation

Handwritten Chinese Font Generation with Collaborative Stroke Refinement

StrokeGAN: Reducing Mode Collapse in Chinese Font Generation via Stroke Encoding

VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and Quantization