Investigation related to application of Generative Adversarial Networks in text-to-image synthesis
Yun Xu
DOI: https://doi.org/10.54254/2755-2721/55/20241100
2024-04-25
Abstract: Generative Adversarial Networks (GANs) have recently attracted considerable research attention as a means of generating images from textual descriptions. In a GAN, the generator and the discriminator are trained against each other, and this adversarial interplay drives the production of lifelike images. The approach is versatile and easy to use, and it can generate realistic, diverse, and semantically faithful images conditioned on text. However, the field still has two open problems: generating high-resolution images that contain multiple elements, and constructing evaluation criteria that correlate with human perception. This paper places a number of adversarial text-to-image generation models and their core principles in context. It also examines the methods currently used to analyze text-to-image generation models, highlights their limitations, and proposes directions for future work. By focusing on the use of GANs in text-to-image synthesis, it offers researchers both a comparative analysis and a benchmark for their own text-to-image generation studies.