Discriminator Modification in GAN for Text-to-Image Generation

Fei Fang, Ziqing Li, Fei Luo, Chunxia Xiao
DOI: https://doi.org/10.1109/icme52920.2022.9859825
2022-01-01
Abstract: Existing text-to-image generation methods based on Generative Adversarial Networks (GANs) suffer from mode collapse and training instability. This paper mitigates these problems by strengthening the discriminator in three ways. First, we propose a diversity-sensitive conditional discriminator (D-SCD), which increases the diversity of the generated images by judging the combination of a generated image and mismatched text as false. Second, for the unconditional discriminator, we propose a contrastive searching gradient penalty (CSGP) strategy that measures the realism of the generated images and penalizes the gradients to stabilize training. Finally, we introduce a multi-level images similarity (MLIS) loss for the discriminator's feature extractor to further promote high-level feature similarity between real and generated images and objects. Extensive experiments and ablation studies demonstrate that these modifications to the discriminators effectively improve the quality of the generated images.
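The D-SCD idea stated in the abstract, treating a generated image paired with mismatched text as false, can be illustrated with a minimal sketch. This is not the authors' implementation: the hinge formulation, the equal weighting of the terms, the inclusion of a (real image, mismatched text) term, and all function and parameter names are assumptions made for illustration only. The inputs are scalar discriminator scores for each image-text pairing.

```python
def hinge_real(score):
    """Hinge penalty for a pair the discriminator should accept (no penalty once score >= 1)."""
    return max(0.0, 1.0 - score)


def hinge_fake(score):
    """Hinge penalty for a pair the discriminator should reject (no penalty once score <= -1)."""
    return max(0.0, 1.0 + score)


def conditional_d_loss(s_real_match, s_fake_match, s_real_mismatch, s_fake_mismatch):
    """Hypothetical conditional discriminator loss over four image-text pairings.

    Assumption: hinge loss with equal weights; the paper may use a different form.
    """
    loss = hinge_real(s_real_match)      # real image + matching text: judged true
    loss += hinge_fake(s_fake_match)     # generated image + matching text: judged false
    loss += hinge_fake(s_real_mismatch)  # real image + mismatched text: judged false
    # The term described in the abstract: generated image + mismatched text is also judged false.
    loss += hinge_fake(s_fake_mismatch)
    return loss
```

In this sketch, the last term is the only part taken directly from the abstract's description; the other three terms are the usual matching-aware pairings and are included only to give that term context.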