Multi-Semantic Fusion Generative Adversarial Network for Text-to-Image Generation

Chunjiang Fu, Liang Zhao, Pingda Huang, Yedan Liu
DOI: https://doi.org/10.1109/ICBDA57405.2023.10104850
2023-03-03
Abstract: Text-to-image synthesis faces two persistent challenges: image quality and image-text alignment. Existing methods mainly synthesize images from a single sentence, from which it is difficult to extract adequate semantic features, so the generated images differ substantially from the ground-truth images. In this paper, we propose a novel Multi-Semantic Fusion Generative Adversarial Network. Our model fuses the semantics shared across different sentences while preserving the semantics unique to each, allowing it to generate accurate images. In addition, we design a multi-sentence joint discriminator to ensure that the generated images match all sentences. Experiments on the CUB and MSCOCO datasets demonstrate that our model has significant advantages.
Computer Science