Abstract: As image generation technology advances, AI-based image generation has been applied in various fields and Artificial Intelligence Generated Content (AIGC) has garnered widespread attention. However, the development of AI-based image generative models also brings new problems and challenges. A significant challenge is that AI-generated images (AIGIs) may exhibit unique distortions compared to natural images, and not all generated images meet the requirements of the real world. Therefore, it is of great significance to evaluate AIGIs more comprehensively. Although previous work has established several human perception-based AIGC image quality assessment (AIGCIQA) databases for text-generated images, AI image generation includes scenarios such as text-to-image and image-to-image, and assessing only the images generated by text-to-image models is insufficient. To address this issue, we establish a human perception-based image-to-image AIGCIQA database, named PKU-I2IQA. We conduct a well-organized subjective experiment to collect quality labels for AIGIs and then conduct a comprehensive analysis of the PKU-I2IQA database. Furthermore, we propose two benchmark models: NR-AIGCIQA, based on the no-reference image quality assessment method, and FR-AIGCIQA, based on the full-reference image quality assessment method. Finally, leveraging this database, we conduct benchmark experiments and compare the performance of the proposed benchmark models. The PKU-I2IQA database and benchmarks will be released at https://github.com/jiquan123/I2IQA to facilitate future research.

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper addresses quality assessment for AI-generated images (AIGIs). AI-generated images are now widely used in various fields, but they may exhibit distortions not found in natural images, and not all of them meet real-world requirements. A comprehensive evaluation of their quality has therefore become important. Several human perception-based AI-generated content image quality assessment (AIGCIQA) databases already exist, but they focus mainly on text-to-image generation models and neglect image-to-image generation. This leaves a gap in current research: there is no database aimed specifically at image-to-image generation scenarios.

To fill this gap, the authors establish the first human perception-based image-to-image AIGCIQA database, named PKU-I2IQA. They also propose two benchmark models: NR-AIGCIQA, based on no-reference image quality assessment methods, and FR-AIGCIQA, based on full-reference image quality assessment methods. Using this database, they run benchmark experiments and compare the performance of the two models.

### Main Contributions

1. **Establishment of the first human perception-based image-to-image AIGCIQA database**: PKU-I2IQA.
2. **Proposal of two benchmark models**: NR-AIGCIQA, based on no-reference image quality assessment methods, and FR-AIGCIQA, based on full-reference image quality assessment methods.
3. **Conducting benchmark experiments**: Evaluating and comparing the performance of the proposed benchmark models on the PKU-I2IQA database.

### Method Overview

- **Database Construction**: 200 categories were selected from ImageNet, and corresponding high-resolution images were collected as image prompts. Midjourney and Stable Diffusion V1.5 were used to generate images. Each model generated 4 images per image prompt, for a total of 1600 images (200 prompts × 2 models × 4 images).
- **Subjective Experiments**: Subjective experiments were organized to collect image quality labels along three dimensions: quality, realism, and text-image correspondence.
- **Benchmark Models**: Two benchmark models were proposed, based on no-reference and full-reference image quality assessment methods, respectively. Pre-trained backbone networks extract features, and a regression network predicts image quality scores.

### Experimental Results

- **Performance Comparison**: The FR-AIGCIQA benchmark model outperformed the NR-AIGCIQA benchmark model.
- **Best Performance**: Among the backbone networks used, ResNet18 performed best on quality and correspondence, ResNet50 performed best on the final score, and InceptionV4 performed best on realism.

### Conclusion

The proposed benchmark models serve as a baseline, but there is still much room for improvement in designing AIGCIQA models. Future research will focus on how to introduce reference images in text-to-image generation scenarios, where no image prompt is available, to improve model performance. The authors also conducted cross-model evaluation experiments, and the results showed that the proposed benchmark models generalize weakly across different generation models.
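The difference between the two benchmark models described under Method Overview can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the feature dimension, and the toy "backbone" (global average pooling followed by a fixed random projection) are all hypothetical stand-ins for a pre-trained network such as ResNet18. Concatenating the features of the generated image and its image prompt in the full-reference branch is also an assumption; the summary only states that the model is full-reference.

```python
import numpy as np

FEAT_DIM = 8  # hypothetical feature size; a real backbone yields far more
rng = np.random.default_rng(0)

# Stand-in for frozen pre-trained backbone weights (assumption, not ResNet).
backbone_proj = rng.normal(size=(FEAT_DIM, 3))


def extract_features(image):
    """Toy feature extractor: average over H and W, project, apply ReLU."""
    pooled = image.mean(axis=(0, 1))                 # shape (3,)
    return np.maximum(backbone_proj @ pooled, 0.0)   # shape (FEAT_DIM,)


def nr_score(generated, weights, bias):
    """No-reference: the score depends on the generated image only."""
    feats = extract_features(generated)
    return float(weights @ feats + bias)


def fr_score(generated, reference, weights, bias):
    """Full-reference: the image prompt serves as the reference image.

    Feature concatenation is an assumed fusion strategy.
    """
    feats = np.concatenate(
        [extract_features(generated), extract_features(reference)]
    )
    return float(weights @ feats + bias)


# Random images and untrained regression weights, for shape-checking only.
# In practice the regression head would be fitted to the subjective labels.
generated = rng.random((64, 64, 3))
reference = rng.random((64, 64, 3))
w_nr = rng.normal(size=FEAT_DIM)
w_fr = rng.normal(size=2 * FEAT_DIM)

nr = nr_score(generated, w_nr, 0.0)
fr = fr_score(generated, reference, w_fr, 0.0)
print("NR score:", nr)
print("FR score:", fr)
```

The only structural difference is the regression input: the full-reference head sees twice as many features because it also receives the image prompt. That extra input is available in image-to-image generation but not in text-to-image generation without an image prompt.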

PKU-I2IQA: An Image-to-Image Quality Assessment Database for AI Generated Images

PKU-AIGIQA-4K: A Perceptual Quality Assessment Database for Both Text-to-Image and Image-to-Image AI-Generated Images

AGIQA-3K: An Open Database for AI-Generated Image Quality Assessment

AIGIQA-20K: A Large Database for AI-Generated Image Quality Assessment

AIGCIQA2023: A Large-scale Image Quality Assessment Database for AI Generated Images: from the Perspectives of Quality, Authenticity and Correspondence

A Perceptual Quality Assessment Exploration for AIGC Images

Subjective and Objective Quality Assessment for in-the-Wild Computer Graphics Images

Subjective Quality Assessment for Images Generated by Computer Graphics

AIGCOIQA2024: Perceptual Quality Assessment of AI Generated Omnidirectional Images

Large Multi-modality Model Assisted AI-Generated Image Quality Assessment

PKU-AIGI-500K: A Neural Compression Benchmark and Model for AI-Generated Images

PIPAL: A Large-Scale Image Quality Assessment Dataset for Perceptual Image Restoration

SF-IQA: Quality and Similarity Integration for AI Generated Image Quality Assessment

Going the Extra Mile in Face Image Quality Assessment: A Novel Database and Model

A survey on IQA

Generalized Visual Quality Assessment of GAN-Generated Face Images

Cuid: A new study of perceived image quality and its subjective assessment

AI-Generated Image Quality Assessment Based on Task-Specific Prompt and Multi-Granularity Similarity

CLIP-AGIQA: Boosting the Performance of AI-Generated Image Quality Assessment with CLIP

Quality Prediction of AI Generated Images and Videos: Emerging Trends and Opportunities