Large Multi-modality Model Assisted AI-Generated Image Quality Assessment

Puyi Wang, Wei Sun, Zicheng Zhang, Jun Jia, Yanwei Jiang, Zhichao Zhang, Xiongkuo Min, Guangtao Zhai

DOI: https://doi.org/10.1145/3664647.3681471

2024-08-03

Abstract: Traditional deep neural network (DNN)-based image quality assessment (IQA) models leverage convolutional neural networks (CNNs) or Transformers to learn quality-aware feature representations, achieving commendable performance on natural scene images. However, when applied to AI-Generated images (AGIs), these DNN-based IQA models exhibit subpar performance. This is largely due to the semantic inaccuracies inherent in certain AGIs, caused by the uncontrollable nature of the generation process. Thus, the capability to discern semantic content becomes crucial for assessing the quality of AGIs. Traditional DNN-based IQA models, constrained by limited parameter complexity and training data, struggle to capture complex fine-grained semantic features, making it challenging to grasp the existence and coherence of the semantic content of the entire image. To address the shortfall in semantic content perception of current IQA models, we introduce a large Multi-modality model Assisted AI-Generated Image Quality Assessment (MA-AGIQA) model, which utilizes semantically informed guidance to sense semantic information and extract semantic vectors through carefully designed text prompts. Moreover, it employs a mixture of experts (MoE) structure to dynamically integrate the semantic information with the quality-aware features extracted by traditional DNN-based IQA models. Comprehensive experiments conducted on two AI-generated content datasets, AIGCQA-20k and AGIQA-3k, show that MA-AGIQA achieves state-of-the-art performance and demonstrate its superior generalization capabilities in assessing the quality of AGIs. Code is available at https://github.com/wangpuyi/MA-AGIQA.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper addresses quality assessment for AI-Generated Images (AGIs). Existing image quality assessment (IQA) models based on deep neural networks (DNNs) perform well on natural scene images but are less effective on AGIs. The main reason is that some AGIs contain semantic inaccuracies introduced by the uncontrollable generation process, while traditional DNN models, constrained by limited parameter complexity and training data, struggle to capture complex fine-grained semantic features. As a result, they cannot reliably judge whether semantic content exists and is coherent across the entire image. To solve this problem, the paper proposes a new framework, large Multi-modality model Assisted AI-Generated Image Quality Assessment (MA-AGIQA). The framework uses a large multi-modality model (LMM), guided by carefully designed text prompts, to sense the semantic information of the image and extract semantic vectors. It then employs a Mixture of Experts (MoE) structure to dynamically combine these semantic features with the quality-aware features extracted by a traditional DNN-based IQA model, thereby improving its ability to assess the quality of AGIs. Experiments on two AI-generated content datasets, AIGCQA-20k and AGIQA-3k, show that MA-AGIQA achieves state-of-the-art performance and generalizes well.
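To make the idea of dynamically combining features with a Mixture of Experts more concrete, here is a minimal sketch of gated fusion in plain Python. It is an illustration under stated assumptions, not the MA-AGIQA implementation: the number of experts (one quality-aware feature plus two semantic vectors), the feature dimension, the single-layer gate, and the names `moe_fuse`, `quality_feat`, `semantic_feat_a`, and `semantic_feat_b` are all hypothetical, and the weights are random rather than trained.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def moe_fuse(expert_features, gate_w, gate_b):
    """Fuse same-length feature vectors with input-dependent gate weights.

    The gate sees the concatenation of all expert features and emits one
    weight per expert; the fused feature is their weighted sum.
    """
    gate_input = [v for feat in expert_features for v in feat]
    gate = softmax(linear(gate_w, gate_b, gate_input))
    dim = len(expert_features[0])
    fused = [sum(g * feat[i] for g, feat in zip(gate, expert_features))
             for i in range(dim)]
    return fused, gate


# Hypothetical inputs: random stand-ins for real extracted features.
rng = random.Random(0)
dim = 4
quality_feat = [rng.uniform(-1, 1) for _ in range(dim)]     # DNN-based IQA feature (assumed)
semantic_feat_a = [rng.uniform(-1, 1) for _ in range(dim)]  # LMM semantic vector (assumed)
semantic_feat_b = [rng.uniform(-1, 1) for _ in range(dim)]  # LMM semantic vector (assumed)
experts = [quality_feat, semantic_feat_a, semantic_feat_b]

# Untrained gate parameters, one output row per expert.
gate_w = [[rng.uniform(-0.5, 0.5) for _ in range(dim * len(experts))]
          for _ in experts]
gate_b = [0.0] * len(experts)

fused, gate = moe_fuse(experts, gate_w, gate_b)
print("gate weights:", [round(g, 3) for g in gate])
print("fused feature:", [round(v, 3) for v in fused])
```

Because the gate weights depend on the input features, the balance between quality-aware and semantic information can change from image to image, which is the sense in which the fusion is dynamic. In a trained model, the fused feature would feed a regression head that predicts the quality score.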

MoE-AGIQA: Mixture-of-Experts Boosted Visual Perception-Driven and Semantic-Aware Quality Assessment for AI-Generated Images

Quality Assessment of AI-Generated Image Based on Cross-modal Correlation

Adaptive Mixed-Scale Feature Fusion Network for Blind AI-Generated Image Quality Assessment

AGIQA-3K: An Open Database for AI-Generated Image Quality Assessment

A Perceptual Quality Assessment Exploration for AIGC Images

AIGIQA-20K: A Large Database for AI-Generated Image Quality Assessment

AIGC Image Quality Assessment Via Image-Prompt Correspondence

AI-Generated Image Quality Assessment Based on Task-Specific Prompt and Multi-Granularity Similarity

PKU-I2IQA: An Image-to-Image Quality Assessment Database for AI Generated Images

CLIP-AGIQA: Boosting the Performance of AI-Generated Image Quality Assessment with CLIP

Quality Prediction of AI Generated Images and Videos: Emerging Trends and Opportunities

AI-generated Image Quality Assessment in Visual Communication

PKU-AIGIQA-4K: A Perceptual Quality Assessment Database for Both Text-to-Image and Image-to-Image AI-Generated Images

AIGCIQA2023: A Large-scale Image Quality Assessment Database for AI Generated Images: from the Perspectives of Quality, Authenticity and Correspondence

Benchmarking Multi-dimensional AIGC Video Quality Assessment: A Dataset and Unified Model

Exploring Rich Subjective Quality Information for Image Quality Assessment in the Wild

SGIQA: Semantic-Guided No-Reference Image Quality Assessment

AIGC-VQA: A Holistic Perception Metric for AIGC Video Quality Assessment

Bringing Textual Prompt to AI-Generated Image Quality Assessment