Prior knowledge guided text to image generation

An-An Liu, Zefang Sun, Ning Xu, Rongbao Kang, Jinbo Cao, Fan Yang, Weijun Qin, Shenyuan Zhang, Jiaqi Zhang, Xuanya Li
DOI: https://doi.org/10.1016/j.patrec.2023.12.003
IF: 4.757
2023-12-11
Pattern Recognition Letters
Abstract: Generating a realistic and semantically consistent image from a given text is a challenging task. Because natural language carries limited information, it is difficult to generate vivid images with fine details. To address this problem, we propose a Prior Knowledge Guided GAN for text to image generation. Specifically, the proposed method consists of several Knowledge Guided Up-Blocks. We decompose the image into a superposition of several visual regions, each of which requires corresponding prior knowledge to enrich its visual details. Accordingly, we construct each Up-Block by incorporating relevant prior knowledge as input, aiming to enhance the quality of each visual region. Prior knowledge progressively provides more visual detail through affine transformations. Finally, high-quality images are synthesized by fusing all image regions. Experimental results on the CUB and COCO datasets demonstrate the superior performance of the proposed method.
computer science, artificial intelligence
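
The abstract's two mechanisms, affine transformations driven by prior knowledge and the fusion of image regions, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the function names, the linear prediction of the scale and shift, the `1 + ...` form of the scale, and fusion by element-wise summation are all hypothetical choices that the abstract does not specify.

```python
from typing import List

Vector = List[float]
FeatureMap = List[Vector]  # one flattened spatial map per channel


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def affine_modulate(features: FeatureMap, knowledge: Vector,
                    w_gamma: List[Vector], w_beta: List[Vector]) -> FeatureMap:
    """Scale and shift each channel using parameters predicted from `knowledge`.

    Assumption: scale and shift are linear functions of the knowledge vector,
    with one weight row per channel. The scale is 1 + w_gamma[c] . knowledge,
    so zero weights leave the channel unchanged.
    """
    out = []
    for c, channel in enumerate(features):
        gamma = 1.0 + dot(w_gamma[c], knowledge)  # per-channel scale
        beta = dot(w_beta[c], knowledge)          # per-channel shift
        out.append([gamma * x + beta for x in channel])
    return out


def fuse_regions(regions: List[FeatureMap]) -> FeatureMap:
    """Superpose region feature maps by element-wise summation.

    Assumption: summation is used here only as the simplest form of
    superposition; the abstract does not state the fusion rule.
    """
    fused = [[0.0] * len(channel) for channel in regions[0]]
    for region in regions:
        for c, channel in enumerate(region):
            for i, value in enumerate(channel):
                fused[c][i] += value
    return fused


# Usage: modulate two regions with different (hypothetical) knowledge vectors,
# then fuse them into a single feature map.
features = [[1.0, 2.0], [3.0, 4.0]]
w_gamma = [[0.0, 0.0], [1.0, 0.0]]
w_beta = [[0.0, 0.0], [0.0, 2.0]]
region_a = affine_modulate(features, [1.0, 0.5], w_gamma, w_beta)
region_b = affine_modulate(features, [0.0, 0.0], w_gamma, w_beta)
image_features = fuse_regions([region_a, region_b])
```

In this sketch a zero knowledge vector leaves the features untouched, which makes the conditioning an additive refinement on top of the unconditioned features. That is a design choice of the example only.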