RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection

Zhiyuan He, Pin-Yu Chen, Tsung-Yi Ho
2024-05-30
Abstract: The rapid advances in generative AI models have empowered the creation of highly realistic images with arbitrary content, raising concerns about potential misuse and harm, such as Deepfakes. Current research focuses on training detectors using large datasets of generated images. However, these training-based solutions are often computationally expensive and show limited generalization to unseen generated images. In this paper, we propose a training-free method to distinguish between real and AI-generated images. We first observe that real images are more robust to tiny noise perturbations than AI-generated images in the representation space of vision foundation models. Based on this observation, we propose RIGID, a training-free and model-agnostic method for robust AI-generated image detection. RIGID is a simple yet effective approach that identifies whether an image is AI-generated by comparing the representation similarity between the original and the noise-perturbed counterpart. Our evaluation on a diverse set of AI-generated images and benchmarks shows that RIGID significantly outperforms existing training-based and training-free detectors. In particular, the average performance of RIGID exceeds the current best training-free method by more than 25%. Importantly, RIGID exhibits strong generalization across different image generation methods and robustness to image corruptions.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to detect AI-generated images without training a detector and without depending on any particular generative model. Advances in deep generative models have made it possible to produce highly realistic images, which also brings risks of misuse such as Deepfakes. Existing methods usually train detectors on large datasets of generated images, but this approach is computationally expensive and generalizes poorly to generated images not seen during training. The paper proposes a method called RIGID that is both training-free and model-agnostic. RIGID builds on the observation that, in the representation space of vision foundation models, real images are more robust to small noise perturbations than AI-generated images. Using this property, RIGID decides whether an image is AI-generated by comparing the representation of the original image with the representation of its noise-perturbed counterpart: a lower similarity points toward a generated image. Experiments show that RIGID outperforms existing training-based and training-free detectors across a range of AI-generated image datasets and benchmarks, generalizes across different generation methods, and remains robust to image corruptions. In summary, the paper aims to detect AI-generated images efficiently and without training, in order to reduce the risk of misuse while improving detection performance and robustness.
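The comparison described above can be sketched in a few lines of Python. This is a minimal illustration of the idea, not the authors' implementation: the function names, the Gaussian noise level `noise_std`, the number of trials `n_trials`, and the decision `threshold` are all assumptions made for the example, and `make_toy_encoder` is a hypothetical stand-in (a fixed random projection) for a real vision foundation model, so the scores it produces say nothing about real or generated images.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = a.ravel(), b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def perturbation_similarity(image, embed, noise_std=0.05, n_trials=8, seed=0):
    """Average similarity between the embedding of `image` and the
    embeddings of noise-perturbed copies of it.

    `embed` is any callable mapping an image array to a 1-D vector.
    `noise_std` and `n_trials` are illustrative values, not taken
    from the paper.
    """
    rng = np.random.default_rng(seed)
    reference = embed(image)
    sims = []
    for _ in range(n_trials):
        noisy = image + rng.normal(0.0, noise_std, size=image.shape)
        sims.append(cosine_similarity(reference, embed(noisy)))
    return float(np.mean(sims))


def is_ai_generated(image, embed, threshold, **kwargs):
    """Flag an image when its representation is NOT robust to noise,
    i.e. when the similarity falls below `threshold` (an assumed
    parameter that would have to be calibrated in practice)."""
    return perturbation_similarity(image, embed, **kwargs) < threshold


def make_toy_encoder(input_dim, out_dim=64, seed=0):
    """Hypothetical placeholder encoder: a fixed random projection
    followed by tanh. A real setup would call a pretrained vision
    foundation model here instead."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(input_dim, out_dim)) / np.sqrt(input_dim)

    def embed(image):
        return np.tanh(image.ravel() @ weights)

    return embed


if __name__ == "__main__":
    image = np.random.default_rng(1).random((8, 8, 3))  # dummy image in [0, 1]
    embed = make_toy_encoder(image.size)
    score = perturbation_similarity(image, embed)
    print("similarity under perturbation:", score)
    print("flagged as generated:", is_ai_generated(image, embed, threshold=0.99))
```

Because the detector only needs forward passes through a frozen encoder, swapping in a different backbone means changing `embed` and nothing else, which is the sense in which the approach is model-agnostic.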