Highlights:
• Region-Aware Contrastive loss (RAC-loss) maximizes the style information between patches of the generated and reference images, which self-supervises the generator on local style.
• With the proposed RAC-loss, our one-shot font generator (RAC-Font) outperforms previous few-shot and one-shot methods both quantitatively and qualitatively.
• Through self-supervision of local style, we propose a model that has a much simpler structure than previous methods and allows real-time inference.
• The proposed method considers style at a finer level (patches) than the component-level style used previously, resulting in more fine-grained font images.

Abstract: Compositional scripts such as Hangeul (Korean characters) and Chinese characters involve numerous characters, making manual font design labor-intensive and costly. Although many few-shot font generation methods have been introduced, each has at least one of the following limitations: a lack of local font styles, a need for additional component labeling, or high complexity in network structure and training. To address these limitations, and based on our observation that font style can be perceived at the patch level rather than the component level, we propose Region-Aware Contrastive loss (RAC-loss) so that the generator can capture local style through self-supervision. The proposed loss maximizes the style information between patches of the generated image and the style reference image. We also introduce an attention mechanism into the patch-level contrastive loss to handle multiple patch correspondences. This attention learns the style similarity between two glyph images and serves as a patch-correspondence map. RAC-loss gives the generator more fine-grained feedback than a component-level loss, allowing it to incorporate local styles even with a straightforward structure such as a visual geometry group network (VGGNet). This results in low inference latency (3.02 ms), and the proposed method achieves a mean Fréchet Inception Distance (mFID) of 43.18 on the test dataset, a notable decrease of 5.42 compared to the previous method.
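The idea of a patch-level contrastive loss weighted by a patch-correspondence map can be illustrated with a minimal sketch. This is not the RAC-loss as defined in the paper. It assumes an InfoNCE-style formulation in which each generated patch is pulled toward reference patches in proportion to an attention weight and pushed away from patches of other styles. The function names, the temperature values, the toy feature vectors, and the use of a softmax over cosine similarities as a stand-in for the learned attention are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two patch feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def correspondence_map(gen, ref, temperature=0.1):
    """Assumed stand-in for the learned attention: one row per generated
    patch, each row a distribution over the reference patches."""
    return [softmax([cosine(g, r) / temperature for r in ref]) for g in gen]

def patch_contrastive_loss(gen, ref, neg, attn, tau=0.1):
    """Attention-weighted InfoNCE-style loss (hypothetical formulation).

    gen  : patch features of the generated glyph
    ref  : patch features of the style reference glyph (soft positives)
    neg  : patch features taken from other styles (negatives)
    attn : correspondence map, attn[i][j] weights ref[j] for gen[i]
    """
    total = 0.0
    for g, weights in zip(gen, attn):
        pos = sum(w * math.exp(cosine(g, r) / tau) for w, r in zip(weights, ref))
        negs = sum(math.exp(cosine(g, n) / tau) for n in neg)
        total += -math.log(pos / (pos + negs))
    return total / len(gen)

# Toy 2-D patch features (made up for illustration only).
ref = [[1.0, 0.0], [0.0, 1.0]]        # reference-style patches
neg = [[-1.0, 0.0], [0.0, -1.0]]      # patches from a different style
good = [[0.9, 0.1], [0.1, 0.9]]       # generated patches close to the reference
bad = [[-0.9, -0.1], [-0.1, -0.9]]    # generated patches close to the negatives

loss_good = patch_contrastive_loss(good, ref, neg, correspondence_map(good, ref))
loss_bad = patch_contrastive_loss(bad, ref, neg, correspondence_map(bad, ref))
print(loss_good < loss_bad)
```

In this toy setting, the generated patches that resemble the reference patches receive a lower loss than those that resemble the other-style patches. The soft weighting lets one generated patch match several reference patches, which is the role the abstract assigns to the attention-based correspondence map.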

Diff-Font: Diffusion Model for Robust One-Shot Font Generation

Few-shot Font Generation based on SAE and Diffusion Model

FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning

DiffCJK: Conditional Diffusion Model for High-Quality and Wide-coverage CJK Character Generation

Chinese Character Font Generation Based on Diffusion Model

FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation

QT-Font: High-efficiency Font Synthesis Via Quadtree-based Diffusion Models

Emage: Non-Autoregressive Text-to-Image Generation

FET-GAN: Font and Effect Transfer via K-shot Adaptive Instance Normalization

GlyphDiffusion: Text Generation as Image Generation

Few-shot Font Generation with Localized Style Representations and Factorization

GRIF-DM: Generation of Rich Impression Fonts using Diffusion Models

One-shot font generation via local style self-supervision using Region-Aware Contrastive Loss

MA-Font: Few-Shot Font Generation by Multi-Adaptation Method

Arbitrary Font Generation by Encoder Learning of Disentangled Features

Few-shot Font Generation by Learning Style Difference and Similarity

FontGAN: A Unified Generative Framework for Chinese Character Stylization and De-stylization

VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and Quantization

XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation

Few-shot Font Generation with Weakly Supervised Localized Representations

Few shot font generation via transferring similarity guided global style and quantization local style