Abstract: Class-conditional image generation using generative adversarial networks (GANs) has been investigated through various techniques; however, it continues to face challenges such as mode collapse, training instability, and low-quality output in cases of datasets with high intra-class variation. Furthermore, most GANs often converge in larger iterations, resulting in poor iteration efficacy in training procedures. While Diffusion-GAN has shown potential in generating realistic samples, it has a critical limitation in generating class-conditional samples. To overcome these limitations, we propose a novel approach for class-conditional image generation using GANs called DuDGAN, which incorporates a dual diffusion-based noise injection process. Our method consists of three unique networks: a discriminator, a generator, and a classifier. During the training process, Gaussian-mixture noises are injected into the two noise-aware networks, the discriminator and the classifier, in distinct ways. This noisy data helps to prevent overfitting by gradually introducing more challenging tasks, leading to improved model performance. As a result, our method outperforms state-of-the-art conditional GAN models for image generation in terms of performance. We evaluated our method using the AFHQ, Food-101, and CIFAR-10 datasets and observed superior results across metrics such as FID, KID, Precision, and Recall score compared with comparison models, highlighting the effectiveness of our approach.

What problem does this paper attempt to address?

This paper targets common challenges in conditional generative adversarial networks (cGANs): mode collapse, training instability, and low-quality output, especially on datasets with high intra-class variation. In addition, most GANs need a large number of iterations to converge, which makes training inefficient per iteration. Although Diffusion-GAN has shown potential for generating realistic samples, it has a critical limitation when generating class-conditional samples. To address this, the authors propose DuDGAN, which aims to improve the quality and stability of class-conditional image generation through a dual diffusion-based noise injection process. DuDGAN consists of three networks: a discriminator, a generator, and a classifier. During training, Gaussian-mixture noise is injected into the two noise-aware networks, the discriminator and the classifier, in distinct ways. The noisy data helps prevent overfitting by gradually introducing more challenging tasks, which improves model performance. The authors report that the method outperforms state-of-the-art conditional GAN models on the AFHQ, Food-101, and CIFAR-10 datasets, with better FID, KID, Precision, and Recall scores than the comparison models. In this way, DuDGAN improves both the quality and the diversity of generated images and converges faster, which raises the iteration efficiency of training and addresses several key problems in conditional GAN training.
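To make the noise-injection idea concrete, here is a minimal NumPy sketch of diffusion-style noise injection, assuming the standard forward-diffusion form y = sqrt(ᾱ_t)·x + sqrt(1 − ᾱ_t)·ε with a timestep t drawn at random per sample. It is an illustration under stated assumptions, not the authors' implementation: the linear beta schedule, the function names `alpha_bar_schedule` and `inject_noise`, and the two `max_step` values are all hypothetical, and the summary above does not specify how the noise for the discriminator differs from the noise for the classifier.

```python
import numpy as np


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative signal fraction alpha_bar_t for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def inject_noise(x, alpha_bar, max_step, rng):
    """Diffuse each sample in x by a randomly drawn number of steps.

    x        : array of shape (batch, ...), e.g. images scaled to [-1, 1]
    max_step : largest timestep allowed, 1 <= max_step <= len(alpha_bar)
    Returns the noisy batch and the timestep drawn for each sample.
    """
    batch = x.shape[0]
    # Drawing t at random makes the noisy data a mixture of Gaussians over t.
    t = rng.integers(1, max_step + 1, size=batch)
    # Reshape so the per-sample coefficient broadcasts over the image axes.
    a = alpha_bar[t - 1].reshape((batch,) + (1,) * (x.ndim - 1))
    eps = rng.standard_normal(x.shape)
    y = np.sqrt(a) * x + np.sqrt(1.0 - a) * eps
    return y, t


rng = np.random.default_rng(0)
alpha_bar = alpha_bar_schedule(num_steps=1000)
images = rng.uniform(-1.0, 1.0, size=(8, 3, 32, 32))  # stand-in batch, not real data

# Hypothetical: separate noise levels for the two noise-aware networks.
d_input, d_t = inject_noise(images, alpha_bar, max_step=100, rng=rng)
c_input, c_t = inject_noise(images, alpha_bar, max_step=400, rng=rng)
```

In this sketch, raising `max_step` over the course of training would be one way to realize "gradually introducing more challenging tasks", since larger timesteps leave less of the original signal in each sample. That schedule is an assumption here, not a detail taken from the summary.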

DuDGAN: Improving Class-Conditional GANs via Dual-Diffusion

Dual Distribution Matching GAN

Attack Deterministic Conditional Image Generative Models for Diverse and Controllable Generation

Exploring Guided Sampling of Conditional GANs

Distilling Diffusion Models into Conditional GANs

Dual Discriminator Weighted Mixture Generative Adversarial Network for image generation

Turning Waste into Wealth: Leveraging Low-Quality Samples for Enhancing Continuous Conditional Generative Adversarial Networks

Diffusion-GAN: Training GANs with Diffusion

DuelGAN: A Duel Between Two Discriminators Stabilizes the GAN Training

Differentiable Augmentation for Data-Efficient GAN Training

Enhancing Stability in Training Conditional Generative Adversarial Networks via Selective Data Matching

Conditional Generation from Unconditional Diffusion Models using Denoiser Representations

DTGAN: Dual Attention Generative Adversarial Networks for Text-to-Image Generation

Conditional Image Synthesis With Auxiliary Classifier GANs

Enhancing Diffusion-Based Image Synthesis with Robust Classifier Guidance

D2C: Diffusion-Denoising Models for Few-shot Conditional Generation

DGattGAN: Cooperative Up-Sampling Based Dual Generator Attentional GAN on Text-to-Image Synthesis

D2PGGAN: Two Discriminators Used in Progressive Growing of GANS

Improving GANs with A Dynamic Discriminator

Entropy-Driven Sampling and Training Scheme for Conditional Diffusion Generation