No-reference Quality Assessment of Text-to-Image Generation

Haitao Huang, Rongli Jia, Rong Xie, Li Song, Lin Li, Yanan Feng
DOI: https://doi.org/10.1109/bmsb62888.2024.10608192
2024-01-01
Abstract: This paper proposes a novel no-reference quality assessment method for text-to-image generation. Text-to-image generation is the process of producing image content from textual descriptions using deep learning models. Although advances in technology and improvements in models have made it possible to generate high-quality images, some generated images still exhibit unique distortions that reflect the limitations of text-to-image generation models. Through in-depth analysis of existing assessment techniques and quality assessment datasets, we identify limitations in current image quality assessment methods when they are applied to text-to-image outputs. To address these limitations, we propose an approach that assesses the quality of generated images along three key dimensions: visual quality, authenticity, and text-image consistency. The approach thus considers not only the visual quality and authenticity of an image but also the consistency between its content and the corresponding textual description. Experimental results demonstrate that our model accurately captures the unique distortion features of text-to-image outputs and effectively evaluates their quality across these dimensions. It provides a powerful tool for evaluating text-to-image quality and for improving text-to-image generation techniques.