Abstract: Due to the high potential for abuse of GenAI systems, the task of detecting synthetic images has recently become of great interest to the research community. Unfortunately, existing image-space detectors quickly become obsolete as new high-fidelity text-to-image models are developed at blinding speed. In this work, we propose a new synthetic image detector that uses features obtained by inverting an open-source pre-trained Stable Diffusion model. We show that these inversion features enable our detector to generalize well to unseen generators of high visual fidelity (e.g., DALL-E 3) even when the detector is trained only on lower fidelity fake images generated via Stable Diffusion. This detector achieves new state-of-the-art across multiple training and evaluation setups. Moreover, we introduce a new challenging evaluation protocol that uses reverse image search to mitigate stylistic and thematic biases in the detector evaluation. We show that the resulting evaluation scores align well with detectors' in-the-wild performance, and release these datasets as public benchmarks for future research.
What problem does this paper attempt to address?
The paper addresses one main problem: **how to detect synthetic images produced by text-to-image models that the detector has never seen (such as DALL·E 3 and Imagen)**. The authors train a detector on features obtained by inverting Stable Diffusion, so that it can recognize fake images produced by other high-fidelity text-to-image models.
### Main problems and challenges
1. **Limitations of existing detectors**:
- Existing image-space detectors lose accuracy quickly when faced with newly released high-fidelity text-to-image models.
- These detectors tend to depend on the specific generators or datasets they were trained on and generalize poorly to unseen generators.
2. **Risk of abuse of high-fidelity generation models**:
- As text-to-image models improve, so does the risk that they are used to generate harmful or misleading content.
- Commercial models are updated constantly and are closed-source, which makes it harder to keep fake-image detectors up to date.
### Proposed solutions
1. **A new synthetic-image detector**:
- The detector takes the inversion features of Stable Diffusion (the inverted latent noise map and the reconstructed input image) as additional input signals.
- With these signals, the detector can still detect images from higher-fidelity generators even when it is trained only on lower-fidelity fake images.
2. **A new evaluation protocol, SynRIS**:
- To keep the evaluation from being biased toward specific topics or styles, the authors introduce an evaluation protocol based on reverse image search (RIS).
- The protocol matches each generated fake image with real images found on the Internet, so that the real and fake images in the evaluation share topics and styles.
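Once fake images have been matched with real counterparts, scoring a detector on the matched set could look like the sketch below. The choice of AUROC as the metric and the function name `auroc` are assumptions made for illustration; the summary above does not say which metric the protocol reports.

```python
import numpy as np

def auroc(fake_scores, real_scores):
    """Probability that a random fake image scores higher than a random real one.

    Assumption: higher score means "more likely fake". Ties count as half a win.
    """
    fake = np.asarray(fake_scores, dtype=float)[:, None]
    real = np.asarray(real_scores, dtype=float)[None, :]
    wins = (fake > real).sum()
    ties = (fake == real).sum()
    return float((wins + 0.5 * ties) / (fake.size * real.size))

# Hypothetical detector scores for fakes and their matched real images.
print(auroc([0.9, 0.3], [0.5, 0.1]))  # 0.75
```

Because the real images are retrieved to resemble the fakes in topic and style, a detector cannot score well here merely by recognizing subject matter.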
### Formula explanation
- **DDIM inversion process**:
\[
z_{t + 1}=\sqrt{\bar{\alpha}_{t + 1}} f_\theta(z_t, t, c)+\sqrt{1-\bar{\alpha}_{t + 1}} \epsilon_\theta(z_t, t, c)
\]
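where \( z_t \) is the noisy latent variable at time \( t \), \( c \) is the conditioning vector, \( \bar{\alpha}_t \) is the cumulative DDIM noise-schedule coefficient, \( \epsilon_\theta \) is the learned noise-prediction (denoising) network, and \( f_\theta \) is the current estimate of the clean latent variable, which in standard DDIM is \( f_\theta(z_t, t, c)=\left(z_t-\sqrt{1-\bar{\alpha}_t}\, \epsilon_\theta(z_t, t, c)\right)/\sqrt{\bar{\alpha}_t} \).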
- **Reconstruction process**:
\[
\hat{z}_{t - 1}=\sqrt{\bar{\alpha}_{t - 1}} f_\theta(\hat{z}_t, t, c)+\sqrt{1-\bar{\alpha}_{t - 1}} \epsilon_\theta(\hat{z}_t, t, c)
\]
- **Final prediction**:
\[
\phi^*=\arg\min_\phi \mathbb{E}_{x,y}[\ell(h_\phi(x, D(\hat{z}_T), D(\hat{z}_0)), y)]
\]
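where \( h_\phi \) is the detector (the mapping from inputs to a real/fake prediction) with parameters \( \phi \), \( x \) is the input image, \( y \) is its real/fake label, \( D \) decodes a latent back to image space, and \( \ell \) is the binary cross-entropy loss.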
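The three formulas above can be traced end to end with a small NumPy sketch. It is not the authors' implementation: `toy_eps` stands in for the Stable Diffusion noise predictor \( \epsilon_\theta \), `toy_decode` stands in for the decoder \( D \), and the linear schedule in `make_alpha_bar`, the 50 steps, and the latent shape are all assumptions chosen only to make the example runnable.

```python
import numpy as np

def make_alpha_bar(num_steps):
    # Hypothetical linear-beta schedule; alpha_bar[0] = 1 marks the clean latent.
    betas = np.linspace(1e-4, 2e-2, num_steps)
    return np.concatenate(([1.0], np.cumprod(1.0 - betas)))

def toy_eps(z, t, c):
    # Stand-in for eps_theta(z_t, t, c); the real one is the Stable Diffusion UNet.
    return 0.1 * np.tanh(z + c)

def split_prediction(z, t, c, alpha_bar, eps_fn):
    # Returns (f_theta, eps_theta): clean-latent estimate and predicted noise.
    eps = eps_fn(z, t, c)
    f = (z - np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alpha_bar[t])
    return f, eps

def ddim_invert(z0, c, alpha_bar, eps_fn=toy_eps):
    # Inversion: step from z_t to z_{t+1} until the latent noise map z_T.
    z = z0
    for t in range(len(alpha_bar) - 1):
        f, eps = split_prediction(z, t, c, alpha_bar, eps_fn)
        z = np.sqrt(alpha_bar[t + 1]) * f + np.sqrt(1.0 - alpha_bar[t + 1]) * eps
    return z

def ddim_reconstruct(z_T, c, alpha_bar, eps_fn=toy_eps):
    # Reconstruction: step from z_t back to z_{t-1} until the estimate of z_0.
    z = z_T
    for t in range(len(alpha_bar) - 1, 0, -1):
        f, eps = split_prediction(z, t, c, alpha_bar, eps_fn)
        z = np.sqrt(alpha_bar[t - 1]) * f + np.sqrt(1.0 - alpha_bar[t - 1]) * eps
    return z

def toy_decode(z, scale=8):
    # Stand-in for D: first 3 latent channels, nearest-neighbour upsampled.
    return z[:3].repeat(scale, axis=1).repeat(scale, axis=2)

def detector_input(x, z_T, z0_hat):
    # Stack x, D(z_T) and D(z0_hat) along the channel axis as input to h_phi.
    return np.concatenate([x, toy_decode(z_T), toy_decode(z0_hat)], axis=0)

def bce(prob, y, tiny=1e-7):
    # Binary cross-entropy between predicted fake-probabilities and labels.
    p = np.clip(prob, tiny, 1.0 - tiny)
    return float(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)).mean())

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar(50)
z0 = rng.standard_normal((4, 8, 8))   # would come from encoding the image x
c = np.zeros_like(z0)                 # toy conditioning, e.g. an empty prompt
x = toy_decode(z0)                    # placeholder for the input image
z_T = ddim_invert(z0, c, alpha_bar)
z0_hat = ddim_reconstruct(z_T, c, alpha_bar)
features = detector_input(x, z_T, z0_hat)
print(features.shape)  # (9, 64, 64)
```

In this sketch, a classifier playing the role of \( h_\phi \) would take `features` as input and be trained by minimizing `bce` over labeled real and fake images. The reconstruction is only approximate because each reverse step evaluates the noise predictor at a slightly different latent than the matching forward step did.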
### Summary
The main contributions of the paper are:
1. A new synthetic-image detector that uses the inversion features of Stable Diffusion to improve detection of images from unseen generation models.
2. A new evaluation protocol, SynRIS, that keeps the evaluation from being biased toward specific topics or styles.
3. Experiments that verify the effectiveness and reliability of the method, together with the release of the evaluation datasets as a public benchmark for future research.