AI-Generated Video Content Detection Using Vision Language Models

Praveen Tirupattur, Keerthi Veeramachaneni, A. S. Bedi, Mubarak Shah
Abstract: Significant advances have been made in AI-generated content creation, ranging from text to audio to photos and, more recently, video. The latest state-of-the-art (SOTA) models can create content nearly indistinguishable from that made by humans. The spread of AI-generated content poses several risks, including misinformation and privacy and security challenges. We introduce an approach to distinguish AI-generated video content from real videos using pre-trained vision language models. These models, having been trained on a large amount of real image and video content, carry a signal that allows them to distinguish between real and “fake” videos. Using video features extracted from these models, we take both a training-free and a training-based approach. Both methods achieve high accuracy in detecting video content generated by open-source and SOTA models, on par with other works in the field, while reducing computational and time costs and using a smaller reference/training dataset. On average, we obtain a 90 percent F1 score when distinguishing real from fake videos across all models. We also aim to contribute a new dataset of AI-generated and real videos to further advance research in this area.
Computer Science, Engineering