Abstract: The extraordinary ability of generative models has enabled the generation of images of such high quality that human beings cannot distinguish Artificial Intelligence (AI) generated images from real-life photographs. The development of generation techniques has opened up new opportunities but has also introduced potential risks to privacy, authenticity, and security. The task of detecting AI-generated imagery is therefore of paramount importance in preventing illegal activities. To assess the generalizability and robustness of AI-generated image detection, we present a large-scale dataset, referred to as WildFake, comprising state-of-the-art generators, diverse object categories, and real-world applications. The WildFake dataset has the following advantages: 1) Rich content with wild collection: WildFake collects fake images from the open-source community, enriching its diversity with a broad range of image classes and image styles. 2) Hierarchical structure: WildFake contains fake images synthesized by different types of generators, from GANs and diffusion models to other generative models. These key strengths enhance the generalization and robustness of detectors trained on WildFake, thereby demonstrating WildFake's relevance and effectiveness for AI-generated image detectors in real-world scenarios. Moreover, our extensive evaluation experiments are designed to yield insights into the capabilities of different levels of generative models, an advantage afforded by WildFake's hierarchical structure.

What problem does this paper attempt to address?

This paper addresses the detection of AI-generated images, especially the high-quality images produced by advanced generative models such as Generative Adversarial Networks (GANs) and Diffusion Models (DMs). As these models improve, identifying forged images becomes increasingly important because such images can be used to spread false information and influence public opinion. Current detection methods have limited effectiveness on generative models they have not seen.

To overcome this challenge, the paper proposes WildFake, a large-scale dataset of diverse, high-quality fake images collected from the open-source community and covering a wide range of image categories and styles. The dataset is organized hierarchically: it spans different types of generators, different architectures, personalized weights, and different versions of the same model series. This design is intended to give detectors trained on it better generalization and robustness, so that they can adapt to the complex and varied conditions of the real world. Compared with existing datasets, WildFake is not limited to one or two generators and includes a wider range of categories as well as high-quality user-generated images.

The paper evaluates detectors trained on WildFake in a series of experiments and tests their robustness under degradation conditions. The hierarchical structure of WildFake also enables in-depth analysis of the capabilities of generators at different levels. In summary, by creating the WildFake dataset, the paper aims to promote more effective techniques for detecting AI-generated images and to address the challenges posed by constantly evolving image generation technologies.
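To make the idea of evaluating against unseen generators concrete, here is a minimal Python sketch of how a hierarchy of generator type, architecture, and version could be used to hold out one group of fake images and to report detection results per group. It is an illustration under stated assumptions, not the paper's code or the dataset's actual layout: the `FakeSample` record, the `hold_out` and `recall_by_group` helpers, and all sample names and values are hypothetical. Personalized weights could be added as one more field in the same way.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeSample:
    """Hypothetical record for one fake image; field names are assumptions."""
    path: str
    generator_type: str  # e.g. "GAN", "DM", "other"
    architecture: str    # placeholder architecture name
    version: str         # placeholder version tag, "" when not applicable


def hold_out(samples, level, value):
    """Split samples into (seen, unseen) by one attribute of the hierarchy."""
    seen = [s for s in samples if getattr(s, level) != value]
    unseen = [s for s in samples if getattr(s, level) == value]
    return seen, unseen


def recall_by_group(samples, predictions, level):
    """Fraction of fake images flagged as fake, grouped by a hierarchy level."""
    hits, totals = defaultdict(int), defaultdict(int)
    for sample, predicted_fake in zip(samples, predictions):
        key = getattr(sample, level)
        totals[key] += 1
        hits[key] += int(predicted_fake)
    return {key: hits[key] / totals[key] for key in totals}


# Placeholder data; these are not real WildFake entries.
samples = [
    FakeSample("a.png", "GAN", "gan_arch_a", ""),
    FakeSample("b.png", "DM", "dm_arch_a", "v1"),
    FakeSample("c.png", "DM", "dm_arch_a", "v2"),
    FakeSample("d.png", "DM", "dm_arch_b", ""),
]

# Train on diffusion-model fakes only and treat GAN fakes as unseen.
seen, unseen = hold_out(samples, "generator_type", "GAN")

# Made-up detector outputs (True means "predicted fake"), one per sample.
predictions = [True, True, False, True]
print(recall_by_group(samples, predictions, "generator_type"))
```

Passing `"architecture"` or `"version"` as the `level` argument applies the same split and per-group report at a finer level of the hierarchy, which is the kind of analysis a hierarchical dataset makes possible.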

WildFake: A Large-scale Challenging Dataset for AI-Generated Images Detection
