Detecting AI-Generated Images via CLIP

A.G. Moskowitz, T. Gaona, J. Peterson
2024-04-13
Abstract: As AI-generated image (AIGI) methods become more powerful and accessible, it has become a critical task to determine if an image is real or AI-generated. Because AIGI lack the signatures of photographs and have their own unique patterns, new models are needed to determine if an image is AI-generated. In this paper, we investigate the ability of the Contrastive Language-Image Pre-training (CLIP) architecture, pre-trained on massive internet-scale data sets, to perform this differentiation. We fine-tune CLIP on real images and AIGI from several generative models, enabling CLIP to determine if an image is AI-generated and, if so, determine what generation method was used to create it. We show that the fine-tuned CLIP architecture is able to differentiate AIGI as well as or better than models whose architecture is specifically designed to detect AIGI. Our method will significantly increase access to AIGI-detecting tools and reduce the negative effects of AIGI on society, as our CLIP fine-tuning procedures require no architecture changes from publicly available model repositories and consume significantly fewer GPU resources than other AIGI detection models.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper addresses how to reliably distinguish real images from AI-generated images (AIGI), a distinction that grows more important as generation tools become more capable and more widely available. Because AIGI lack the characteristics of photographs and carry distinctive patterns of their own, new models are needed both to flag an image as AI-generated and to identify which generation method produced it. The paper investigates whether a pre-trained Contrastive Language-Image Pre-training (CLIP) model, fine-tuned on real images and on AIGI from several generative models, can perform both tasks. Its stated goals are to make AIGI detection tools more accessible, to reduce the negative effects of AIGI on society, and to lower the GPU resources that detection requires.
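The two-part decision described above (is the image AI-generated, and if so, by which method) can be illustrated with a minimal sketch. It assumes a fine-tuned classifier that emits one logit per class, with class 0 meaning "real" and the remaining classes standing for generative models. The label names, the `classify` helper, and the 0.5 threshold are hypothetical choices for illustration; the abstract does not specify how the paper's model produces or combines its outputs.

```python
import math

# Hypothetical label set: index 0 is "real"; the other entries stand in
# for generative models, which the source text does not name.
LABELS = ["real", "generator_a", "generator_b", "generator_c"]


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, labels=LABELS, threshold=0.5):
    """Return (is_ai_generated, predicted_source, p_ai) from per-class logits."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probs = softmax(logits)
    # Probability that the image is AI-generated: all mass not on "real".
    p_ai = 1.0 - probs[0]
    if p_ai <= threshold:
        return False, labels[0], p_ai
    # Attribute the image to the most likely generator, skipping "real".
    best = max(range(1, len(labels)), key=lambda i: probs[i])
    return True, labels[best], p_ai


# Example with made-up logits, as if produced by a fine-tuned image classifier.
is_ai, source, p_ai = classify([0.0, 3.0, 1.0, 0.0])
print(is_ai, source, round(p_ai, 3))
```

Summing the probability over all generator classes, rather than taking a plain argmax, means an image can be flagged as AI-generated even when no single generator outscores the "real" class. Whether that behavior is desirable depends on the application, and it is an assumption of this sketch rather than something the paper states.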