Abstract: Generative Adversarial Networks (GANs) have emerged as a prominent research focus for image editing tasks, leveraging the powerful image generation capabilities of the GAN framework to produce remarkable results. However, prevailing approaches are contingent upon extensive training datasets and explicit supervision, presenting a significant challenge in manipulating the diverse attributes of new image classes with limited sample availability. To surmount this hurdle, we introduce TAGE, an innovative image generation network comprising three integral modules: the Codebook Learning Module (CLM), the Code Prediction Module (CPM) and the Prompt-driven Semantic Module (PSM). The CLM delves into the semantic dimensions of category-agnostic attributes, encapsulating them within a discrete codebook. This module is predicated on the concept that images are assemblages of attributes, and thus, by editing these category-independent attributes, it is theoretically possible to generate images from unseen categories. Subsequently, the CPM facilitates naturalistic image editing by predicting indices of category-independent attribute vectors within the codebook. Additionally, the PSM generates semantic cues that are seamlessly integrated into the Transformer architecture of the CPM, enhancing the model's comprehension of the targeted attributes for editing. With these semantic cues, the model can generate images that accentuate desired attributes more prominently while maintaining the integrity of the original category, even with a limited number of samples. We have conducted extensive experiments utilizing the Animal Faces, Flowers, and VGGFaces datasets. The results of these experiments demonstrate that our proposed method not only achieves superior performance but also exhibits a high degree of stability when compared to other few-shot image generation techniques.
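
The idea in the abstract, that attributes live in a discrete codebook and editing amounts to picking codebook indices, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `nearest_code` and `edit_latent`, the `strength` parameter, and the additive editing rule are all assumptions made for illustration, and the two-dimensional codebook is a toy.

```python
def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def edit_latent(latent, codebook, indices, strength=1.0):
    """Add the attribute directions selected by `indices` to a latent code.

    Assumption: editing is additive in latent space; the paper may differ.
    """
    edited = [float(v) for v in latent]
    for idx in indices:
        for d, value in enumerate(codebook[idx]):
            edited[d] += strength * value
    return edited


# Toy codebook of three hypothetical category-independent attribute directions.
toy_codebook = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
snapped = nearest_code([0.9, 0.1], toy_codebook)              # quantize a direction
edited = edit_latent([0.0, 0.0], toy_codebook, [snapped, 1])  # apply two attributes
```

Restricting edits to entries of a fixed codebook is what keeps the edited latent within a constrained set of directions, which is the intuition behind the stability claim.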

What problem does this paper attempt to address?

The problem this paper attempts to solve is how to stably edit and generate high-quality images of new categories when only a limited number of samples is available. Specifically, existing few-shot image generation methods rely on large amounts of training data and explicit supervision, and they struggle to handle the diverse attribute editing of new categories. To solve this problem, the authors propose TAGE (Trustworthy Attribute Group Editing for Stable Few-shot Image Generation), an innovative image generation network.

### Core of the problem

1. **Limitations of existing methods**:
   - Existing few-shot image generation methods fall into three main categories: optimization-based, fusion-based, and transformation-based methods. Each has its own problems:
     - **Optimization-based methods**: Although a general model can be trained through meta-learning, it captures the details of unseen categories poorly, resulting in less realistic generated images.
     - **Fusion-based methods**: They require high similarity between input images and have a high computational cost, which limits their range of application.
     - **Transformation-based methods**: The process of learning and applying intra-class variations is complex and unstable, which makes training difficult.
2. **Challenges of editing methods**:
   - Editing methods generate images by editing attributes, but they cannot effectively edit the attributes of unseen categories. For example, the AGE (Attribute Group Editing) method may cause organs to disappear or the image to collapse during generation, degrading image quality.

### TAGE's solutions

TAGE addresses the above problems by introducing three key modules:

1. **Codebook Learning Module (CLM)**:
   - Identifies semantic directions from unlabeled images and stores them in a discrete codebook, so that known attributes can be recombined to generate images of unseen categories.
   - Constrains the latent space and stores high-quality reconstruction elements, thereby improving image quality.
2. **Code Prediction Module (CPM)**:
   - Predicts latent codes to ensure accurate attribute editing when data is limited or highly diverse.
   - Uses global composition information and long-range dependencies to predict codes, improving the diversity of generated images.
3. **Prompt-driven Semantic Module (PSM)**:
   - Generates semantic prompts that guide the CPM to perform fine-grained attribute operations while maintaining consistency.
   - Injects semantically guided prompts into the Transformer layers to achieve high-quality image generation and editing.

### Experimental verification

The paper reports extensive experiments on three datasets (Animal Faces, Flowers, and VGGFaces). The results show that TAGE not only achieves superior performance but also exhibits higher stability in few-shot image generation tasks.

### Summary

TAGE aims to solve the problems of stability and high-quality generation in few-shot image generation. By introducing a discrete codebook, a code prediction module, and a prompt-driven semantic module, it enables efficient editing and generation of images from unseen categories.
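
To make the interaction between the CPM and the PSM more concrete, the following is a rough sketch of how prompt tokens might steer the prediction of a codebook index. It is a toy under stated assumptions and not the paper's architecture: prompts are assumed to be prepended to the token sequence, a single dot-product attention step stands in for the Transformer, the query is simply the mean image token, and every name (`attend`, `predict_code_index`, the token values) is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, tokens):
    """Single-head scaled dot-product attention: a weighted average of `tokens`."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, t) / scale for t in tokens])
    return [sum(w * t[d] for w, t in zip(weights, tokens)) for d in range(len(query))]


def predict_code_index(image_tokens, prompt_tokens, codebook):
    """Predict one codebook index from image tokens plus optional prompt tokens."""
    # Assumption: semantic prompts are injected by prepending them to the sequence.
    sequence = prompt_tokens + image_tokens
    dim = len(codebook[0])
    # Assumption: the query is the mean image token (a stand-in for a learned query).
    query = [sum(t[d] for t in image_tokens) / len(image_tokens) for d in range(dim)]
    context = attend(query, sequence)
    # Score every codebook entry against the attended context.
    probs = softmax([dot(context, code) for code in codebook])
    best = max(range(len(codebook)), key=probs.__getitem__)
    return best, probs


toy_codebook = [[1.0, 0.0], [0.0, 1.0]]
image_tokens = [[0.1, 0.0]]
plain_index, _ = predict_code_index(image_tokens, [], toy_codebook)
guided_index, _ = predict_code_index(image_tokens, [[0.0, 5.0]], toy_codebook)
```

In this toy setup, the prompt token shifts the attended context toward the second codebook entry, so the predicted index changes once the prompt is present. A real model would learn the prompts, the queries, and the prediction head rather than fixing them by hand.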

TAGE: Trustworthy Attribute Group Editing for Stable Few-shot Image Generation
