Abstract: Few-shot image generation aims to generate diverse and high-quality images for an unseen class given only a few examples in that class. However, existing methods often suffer from a trade-off between image quality and diversity while offering limited control over the attributes of newly generated images. In this work, we propose Hyperbolic Diffusion Autoencoders (HypDAE), a novel approach that operates in hyperbolic space to capture hierarchical relationships among images and texts from seen categories. By leveraging pre-trained foundation models, HypDAE generates diverse new images for unseen categories with exceptional quality by varying semantic codes or guided by textual instructions. Most importantly, the hyperbolic representation introduces an additional degree of control over semantic diversity through the adjustment of radii within the hyperbolic disk. Extensive experiments and visualizations demonstrate that HypDAE significantly outperforms prior methods by achieving a superior balance between quality and diversity with limited data and offers a highly controllable and interpretable generation process.

What problem does this paper attempt to address?

This paper addresses the trade-off between image quality and diversity in **Few-shot Image Generation**, as well as the limited control over the attributes of newly generated images. Specifically, existing methods struggle to generate images of unseen classes that are both high-quality and diverse when only a small number of examples is available.

### Problem Background

1. **Challenges in Few-shot Image Generation**:
   - Existing methods usually trade image quality against diversity.
   - Generating diverse, high-quality images for unseen classes from only a few samples is difficult.
   - Existing methods struggle to precisely control specific attributes of newly generated images.
2. **Limitations of Existing Methods**:
   - GAN-based methods struggle to generate images that are both diverse and high-quality.
   - Euclidean-space representations struggle to capture the hierarchical structure among image classes, which limits the quality and diversity of generated images.

### Solution

To overcome these problems, the authors propose **Hyperbolic Diffusion Autoencoders (HypDAE)**, a new method that operates in hyperbolic space. The main contributions of HypDAE include:

1. **Introducing Hyperbolic Space**:
   - Hyperbolic space naturally represents hierarchical structures and is suitable for capturing the complex semantic relationships between images and text.
   - Adjusting the radius in the Poincaré disk gives flexible control over the semantic diversity of generated images.
2. **Combining Diffusion Models**:
   - Using pre-trained diffusion models (such as Stable Diffusion), high-quality and diverse images can be generated from a small amount of data.
   - Diffusion models provide richer detail and higher generation quality.
3. **Controllable Image Editing**:
   - HypDAE supports image editing through text guidance, allowing users to specify the direction and features of generated images.
   - Adjusting the parameters in hyperbolic space gives flexible control over generated images.

### Experimental Results

Experiments show that HypDAE significantly outperforms existing methods on multiple datasets: it generates diverse images while maintaining high quality, and provides better controllability and interpretability.

### Summary

This paper addresses the trade-off between image quality and diversity in few-shot image generation by combining hyperbolic space with diffusion models, providing a new method for generating high-quality, diverse, and controllable images.
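
The radius adjustment mentioned under "Introducing Hyperbolic Space" can be illustrated with standard Poincaré-ball geometry. The following is a minimal sketch, not the authors' implementation: it assumes curvature -1, and the function names `hyperbolic_radius` and `rescale_to_radius` and the example latent code are hypothetical. It only shows how a point is moved to a chosen hyperbolic distance from the origin while keeping its direction.

```python
import numpy as np


def hyperbolic_radius(x):
    """Hyperbolic distance from the origin in the Poincare ball (curvature -1)."""
    return 2.0 * np.arctanh(np.linalg.norm(x))


def rescale_to_radius(x, target_radius):
    """Move x along its ray from the origin to a given hyperbolic radius.

    Hypothetical helper for illustration; the direction of x is unchanged.
    """
    norm = np.linalg.norm(x)
    if norm == 0.0:
        raise ValueError("the origin has no direction to rescale along")
    # A point at hyperbolic distance r from the origin has Euclidean norm tanh(r / 2).
    return np.tanh(target_radius / 2.0) * x / norm


# Hypothetical 2-D latent code inside the unit disk (Euclidean norm 0.5).
code = np.array([0.3, 0.4])
moved = rescale_to_radius(code, target_radius=1.0)

print(hyperbolic_radius(code))   # hyperbolic radius before rescaling
print(hyperbolic_radius(moved))  # hyperbolic radius after rescaling
```

In a setup like the one summarized above, the rescaled code would then be passed to the decoder. How a given radius maps to more or less semantic variation is determined by the trained model and is not captured by this sketch.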

Diffusion Autoencoders for Few-shot Image Generation in Hyperbolic Space

The Euclidean Space is Evil: Hyperbolic Attribute Editing for Few-shot Image Generation

AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

Few-shot Image Generation with Diffusion Models

Hierarchical Diffusion Autoencoders and Disentangled Image Manipulation

Few-shot 3D Shape Generation

Few-shot Hybrid Domain Adaptation of Image Generator

Few-Shot Image Generation by Conditional Relaxing Diffusion Inversion

Autoencoding Hyperbolic Representation for Adversarial Generation

Few-shot Image Generation Using Discrete Content Representation

Few-shot Image Generation via Information Transfer from the Built Geodesic Surface

Diverse Diffusion: Enhancing Image Diversity in Text-to-Image Generation

DeltaGAN: Towards Diverse Few-Shot Image Generation with Sample-Specific Delta

Image Generation Diversity Issues and How to Tame Them

Conditional Distribution Modelling for Few-Shot Image Synthesis with Diffusion Models

DIAGen: Diverse Image Augmentation with Generative Models

Diversity-enhancing Generative Network for Few-shot Hypothesis Adaptation

High-Quality and Diverse Few-Shot Image Generation via Masked Discrimination

Customize Your Own Paired Data via Few-shot Way

Causal Diffusion Autoencoders: Toward Counterfactual Generation via Diffusion Probabilistic Models