PCQA: A Strong Baseline for AIGC Quality Assessment Based on Prompt Condition

Xi Fang, Weigang Wang, Xiaoxin Lv, Jun Yan
2024-04-20
Abstract: The development of Large Language Models (LLMs) and Diffusion Models has driven a boom in Artificial Intelligence Generated Content (AIGC). An effective quality assessment framework is essential for providing a quantifiable evaluation of images and videos produced by AIGC technologies. Because the content generated by AIGC methods is driven by crafted prompts, it is intuitive that the prompts can also serve as the foundation of AIGC quality assessment. This study proposes an effective AIGC quality assessment (QA) framework. First, we propose a hybrid prompt encoding method based on a dual-source CLIP (Contrastive Language-Image Pre-Training) text encoder to understand and respond to the prompt conditions. Second, we propose an ensemble-based feature mixer module to effectively blend the adapted prompt and vision features. Experiments on two datasets, AIGIQA-20K (AI-Generated Image Quality Assessment database) and T2VQA-DB (Text-to-Video Quality Assessment DataBase), validate the effectiveness of our proposed method, Prompt Condition Quality Assessment (PCQA). Our simple and feasible framework may promote research in the multimodal generation field.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

The paper aims to address the issue of quality assessment for AI-generated content (AIGC), specifically how to evaluate the artistic quality of AIGC content relatively objectively through machine learning models, and reduce subjectivity and bias in the evaluation process.

### Specific Goals

1. **Propose a Unified Framework**: Propose a unified framework for AIGC image and video quality assessment based on specific prompt conditions.
2. **High-level Semantic Information**: Focus on the high-level semantic information of AIGC works rather than low-level details.
3. **Feature Mixing Mechanism**: Design a mechanism based on feature adapters and feature mixers to enable effective interaction between prompt conditions and visual features.
4. **Reduce Scoring Bias**: Propose a novel ensemble method to mitigate scoring bias in the quality assessment process.

### Main Contributions

1. Proposed a unified framework based on prompt conditions for quality regression of AIGC images or videos.
2. Designed feature adapter and feature mixer mechanisms to achieve effective interaction between prompt conditions and visual features.
3. Proposed a new ensemble method to reduce scoring bias in quality assessment.

### Experimental Validation

Experiments were conducted on two datasets, AIGIQA-20K (AI-Generated Image Quality Assessment Database) and T2VQA-DB (Text-to-Video Quality Assessment Database), to validate the effectiveness of the proposed method. Experimental results show that the method significantly outperforms baseline methods.
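
### Illustrative Sketch

To make the idea of mixing prompt and vision features more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that two prompt embeddings (standing in for the dual-source text encoders) and one vision embedding are already available as arrays. The class name `PromptConditionedScorer`, the three mixing strategies (product, sum, concatenation), the hidden size, and the random weights are all hypothetical choices made for illustration; the paper's actual adapters, mixers, and ensemble may differ.

```python
import numpy as np

rng = np.random.default_rng(0)


def linear(in_dim, out_dim):
    """Return a random linear map; a stand-in for a trained layer."""
    w = rng.normal(scale=in_dim ** -0.5, size=(in_dim, out_dim))
    b = np.zeros(out_dim)
    return lambda x: x @ w + b


class PromptConditionedScorer:
    """Hypothetical prompt-conditioned quality regressor (illustration only)."""

    def __init__(self, text_dim_a, text_dim_b, vision_dim, hidden=32):
        # Adapters project each modality into a shared hidden space.
        self.prompt_adapter = linear(text_dim_a + text_dim_b, hidden)
        self.vision_adapter = linear(vision_dim, hidden)
        # One regression head per mixing strategy (assumed strategies).
        self.heads = {
            "product": linear(hidden, 1),
            "sum": linear(hidden, 1),
            "concat": linear(2 * hidden, 1),
        }

    def member_scores(self, prompt_a, prompt_b, vision):
        """Return one score per ensemble member, shape (members, batch)."""
        # Hybrid prompt encoding: concatenate the two prompt embeddings.
        prompt = np.concatenate([prompt_a, prompt_b], axis=-1)
        p = np.tanh(self.prompt_adapter(prompt))
        v = np.tanh(self.vision_adapter(vision))
        mixed = {
            "product": p * v,
            "sum": p + v,
            "concat": np.concatenate([p, v], axis=-1),
        }
        return np.stack(
            [self.heads[name](mixed[name])[..., 0] for name in self.heads],
            axis=0,
        )

    def score(self, prompt_a, prompt_b, vision):
        """Average the member scores to form the ensemble prediction."""
        return self.member_scores(prompt_a, prompt_b, vision).mean(axis=0)


# Usage with random stand-in embeddings for a batch of 4 samples.
scorer = PromptConditionedScorer(text_dim_a=8, text_dim_b=6, vision_dim=10)
prompt_a = rng.normal(size=(4, 8))
prompt_b = rng.normal(size=(4, 6))
vision = rng.normal(size=(4, 10))
scores = scorer.score(prompt_a, prompt_b, vision)
print(scores.shape)
```

Averaging several differently mixed predictions is one simple way an ensemble can dampen the bias of any single head; in practice the layers would be trained against human quality ratings rather than left random.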