AIGC-VQA: A Holistic Perception Metric for AIGC Video Quality Assessment

Zihao Yu, Ruling Liao, Yan Ye, Zhibo Chen, Yiting Lu, Fengbin Guan, Bingchen Li, Xin Li, Xinrui Wang
DOI: https://doi.org/10.1109/CVPRW63382.2024.00640
2024-06-17
Abstract: With the development of generative models, such as diffusion and auto-regressive models, AI-generated content (AIGC) is experiencing explosive growth. However, existing quality metrics extracted from fixed pre-trained models struggle to align accurately with human perception. There is an urgent need for an adaptive metric that can gauge the multiple critical quality factors of AIGC videos (i.e., technical quality, aesthetic quality, and video-text alignment), both to provide quality assessment and to guide the optimization of generative models. In this work, we propose a holistic metric for AIGC video quality assessment, termed AIGC-VQA, which contains three functional branches that cooperate on the technical, aesthetic, and video-text alignment aspects of AIGC videos. Specifically, to efficiently transfer knowledge of image-text alignment to video-text alignment, we introduce a spatial-temporal adapter that exploits the pre-trained prior of a large-scale image-text model and achieves temporal knowledge adaptation. In addition, we propose a divide-and-conquer training strategy for progressive cooperation among the branches. Owing to its holistic perception ability, the proposed AIGC-VQA obtains state-of-the-art results on the T2VQA-DB dataset.
Computer Science