AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI

Fanda Fan, Chunjie Luo, Wanling Gao, Jianfeng Zhan
2024-01-23
Abstract: The burgeoning field of Artificial Intelligence Generated Content (AIGC) is witnessing rapid advancements, particularly in video generation. This paper introduces AIGCBench, a pioneering comprehensive and scalable benchmark designed to evaluate a variety of video generation tasks, with a primary focus on Image-to-Video (I2V) generation. AIGCBench tackles the limitations of existing benchmarks, which suffer from a lack of diverse datasets, by including a varied and open-domain image-text dataset that evaluates different state-of-the-art algorithms under equivalent conditions. We employ a novel text combiner and GPT-4 to create rich text prompts, which are then used to generate images via advanced Text-to-Image models. To establish a unified evaluation framework for video generation tasks, our benchmark includes 11 metrics spanning four dimensions to assess algorithm performance: control-video alignment, motion effects, temporal consistency, and video quality. The metrics include both measures that depend on a reference video and measures that do not, ensuring a comprehensive evaluation strategy. The proposed evaluation standard correlates well with human judgment, providing insights into the strengths and weaknesses of current I2V algorithms. The findings from our extensive experiments aim to stimulate further research and development in the I2V field. AIGCBench represents a significant step toward creating standardized benchmarks for the broader AIGC landscape, proposing an adaptable and equitable framework for future assessments of video generation tasks. We have open-sourced the dataset and evaluation code on the project website:
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper introduces AIGCBench, a comprehensive and scalable benchmark for Image-to-Video (I2V) generation within Artificial Intelligence Generated Content (AIGC). Video generation is advancing rapidly, but the field lacks a comprehensive benchmark for evaluating video generation tasks, I2V in particular. AIGCBench addresses this gap with a diverse, open-domain image-text dataset: a text combiner and GPT-4 produce rich text prompts, and advanced text-to-image models turn those prompts into images. The benchmark consists of three modules: the evaluation datasets, the evaluation metrics, and the video generation models to be evaluated. It proposes 11 metrics covering four dimensions: control-video alignment, motion effects, temporal consistency, and video quality. The metrics include both reference-based and reference-free measures, and human validation is used to confirm that the evaluation criteria are effective. Compared with existing benchmarks, AIGCBench is more comprehensive and compares different algorithms under equivalent conditions, which makes it possible to analyze the strengths and limitations of each algorithm. The paper also compares AIGCBench with other benchmarks and lists its contributions: creating high-quality images, expanding image-text datasets, and comprehensively evaluating I2V algorithms. Overall, AIGCBench is a standardized benchmark for video generation tasks that aims to promote evaluation and future development in AIGC, particularly I2V generation.
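To make the "temporal consistency" dimension concrete, the sketch below shows one common reference-free way to score it: the mean cosine similarity between feature vectors of consecutive frames. This is an illustrative assumption, not necessarily the exact metric AIGCBench uses. The function name temporal_consistency is hypothetical, and the per-frame features are assumed to come from some image encoder that is not shown here.

```python
import numpy as np


def temporal_consistency(frame_features: np.ndarray) -> float:
    """Mean cosine similarity between consecutive frame feature vectors.

    frame_features: array of shape (num_frames, feature_dim), one row per
    frame. The features are assumed to be precomputed by an image encoder.
    """
    feats = np.asarray(frame_features, dtype=np.float64)
    if feats.ndim != 2 or feats.shape[0] < 2:
        raise ValueError("need a 2-D array with at least two frames")

    # Normalize each frame vector to unit length (guarding against zeros).
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    unit = feats / np.clip(norms, 1e-12, None)

    # Cosine similarity of frame t with frame t+1, for every adjacent pair.
    sims = np.sum(unit[:-1] * unit[1:], axis=1)
    return float(sims.mean())


# Hypothetical inputs: a perfectly static clip and a clip of unrelated frames.
static_clip = np.tile(np.array([1.0, 2.0, 3.0]), (4, 1))
random_clip = np.random.default_rng(0).normal(size=(4, 3))

static_score = temporal_consistency(static_clip)
random_score = temporal_consistency(random_clip)
```

A static clip scores at the maximum, which is why a score like this is usually read alongside a motion-effect measure: high consistency alone cannot distinguish a smooth video from one that barely moves.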