Quality Prediction of AI Generated Images and Videos: Emerging Trends and Opportunities

Abhijay Ghildyal, Yuanhan Chen, Saman Zadtootaghaj, Nabajeet Barman, Alan C. Bovik
2024-10-20
Abstract: The advent of AI has influenced many aspects of human life, from self-driving cars and intelligent chatbots to text-based image and video generation models capable of creating realistic images and videos based on user prompts (text-to-image, image-to-image, and image-to-video). AI-based methods for image and video super resolution, video frame interpolation, denoising, and compression have already gathered significant attention and interest in the industry, and some solutions are already being implemented in real-world products and services. However, to achieve widespread integration and acceptance, AI-generated and -enhanced content must be visually accurate, adhere to intended use, and maintain high visual quality to avoid degrading the end user's quality of experience (QoE). One way to monitor and control the visual "quality" of AI-generated and -enhanced content is by deploying Image Quality Assessment (IQA) and Video Quality Assessment (VQA) models. However, most existing IQA and VQA models measure visual fidelity in terms of "reconstruction" quality against a pristine reference content and were not designed to assess the quality of "generative" artifacts. To address this, newer metrics and models have recently been proposed, but their performance evaluation and overall efficacy have been limited by datasets that are too small or otherwise lack representative content and/or distortion capacity, and by the lack of performance measures that can accurately report the success of an IQA/VQA model for "GenAI". This paper examines the current shortcomings and possibilities presented by AI-generated and -enhanced image and video content, with a particular focus on end-user perceived quality. Finally, we discuss open questions and make recommendations for future work on the "GenAI" quality assessment problems, toward further progress in this interesting and relevant field of research.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
This preprint aims to address the challenges of assessing the quality of AI-generated and AI-enhanced image and video content. Specifically, the paper focuses on the following key issues:

1. **Limitations of existing quality assessment models**:
   - **Traditional IQA (Image Quality Assessment) and VQA (Video Quality Assessment) models** are mainly based on reconstruction quality, that is, the difference from the original reference content, and are not suited to assessing the quality of generative content. For example, AI-generated content may contain non-traditional distortions such as extra limbs or impossible animals, which traditional "reconstruction" quality indicators cannot measure.
   - **Lack of new metrics and models suitable for evaluating generative content**. Existing IQA/VQA methods cannot capture the artifacts unique to generative content (such as semantic errors and common-sense errors), and therefore cannot effectively predict the user's quality of experience (QoE).
2. **Deficiencies in datasets and performance evaluation**:
   - **Small-scale or unrepresentative datasets** limit the performance evaluation of newly proposed IQA/VQA models. Existing datasets usually lack the diversity or representativeness needed to comprehensively assess the quality of generative content.
   - **Lack of effective performance metrics** that can accurately report the success of IQA/VQA models on generative AI (GenAI) content.
3. **Unique challenges of generative content**:
   - **Visual quality and semantic accuracy**: AI-generated content must not only maintain high technical quality, but also satisfy the input prompt and be semantically and common-sense correct. For example, a generated image should correctly reflect the scene and details described in the input text.
   - **Complexity and diversity**: As the complexity and diversity of AI-generated content increase, ensuring the authenticity and plausibility of generative content has become an important research direction.
4. **Directions and suggestions for future research**:
   - **Innovate quality assessment models for generative content**: Develop new IQA/VQA models that can simultaneously evaluate technical quality, semantic correctness, bias, and aesthetics.
   - **Construct new datasets**: Create datasets of generative AI content labeled with subjective scores to support more extensive performance evaluation and model training.
   - **Interdisciplinary cooperation**: Combine knowledge from fields such as computer vision, deep learning, and psychology to jointly advance research on quality assessment of generative content.

In summary, this paper highlights the problems in current quality assessment of AI-generated and AI-enhanced content and proposes potential directions for future research, with the aim of advancing the field.
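To make the "reconstruction" quality idea concrete, here is a minimal sketch of one classic reference-based measure, PSNR. PSNR is chosen here purely as an illustration and is not named in the text above; the function name `psnr`, the nested-list image representation, and the toy pixel values are all assumptions made for this example. The point it illustrates is that such a measure needs a pristine reference to compare against, and a purely generated image has none.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two equally sized
    grayscale images given as nested lists of pixel values.

    Illustrative only: the score is defined relative to `reference`,
    so it cannot be computed for content that has no reference.
    """
    ref_flat = [p for row in reference for p in row]
    test_flat = [p for row in test for p in row]
    if len(ref_flat) != len(test_flat):
        raise ValueError("reference and test must have the same number of pixels")

    # Mean squared error between the reference and the test image.
    mse = sum((r - t) ** 2 for r, t in zip(ref_flat, test_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 "images": a reference and a slightly distorted copy.
reference = [[100, 100], [100, 100]]
distorted = [[110, 90], [100, 100]]
score = psnr(reference, distorted)
```

A measure of this kind can flag pixel-level deviations from a reference, but it has no notion of whether an image matches a prompt or whether a hand has the right number of fingers, which is the gap the summary above describes.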