Abstract: Learning intrinsic bias from limited data has been considered the main reason for the poor generalizability of deepfake detection. Apart from the previously discovered content and specific-forgery biases, we reveal a novel spatial bias, where detectors inertly anticipate observing structural forgery clues at the image center, which can also lead to the poor generalization of existing methods. We present ED$^4$, a simple and effective strategy that addresses the aforementioned biases explicitly at the data level in a unified framework, rather than through implicit disentanglement via network design. In particular, we develop ClockMix to produce facial-structure-preserving mixtures of arbitrary samples, which allows the detector to learn from an exponentially extended data distribution with much more diverse identities, backgrounds, local manipulation traces, and co-occurrences of multiple forgery artifacts. We further propose the Adversarial Spatial Consistency Module (AdvSCM) to prevent the extraction of spatially biased features: it adversarially generates spatially inconsistent images and constrains their extracted features to be consistent. As a model-agnostic debiasing strategy, ED$^4$ is plug-and-play: it can be integrated with various deepfake detectors to obtain significant benefits. We conduct extensive experiments to demonstrate its effectiveness and superiority over existing deepfake detection approaches.

What problem does this paper attempt to address?

The main problem this paper attempts to solve is the insufficient generalization ability of deepfake detection models. Specifically, existing deepfake detectors perform poorly on unseen data distributions, mainly because they learn intrinsic biases from limited data. The paper identifies three such biases:

1. **Content Bias**: Detectors may wrongly rely on specific identity or background information to judge the authenticity of an image, rather than focusing on the characteristics of forgery.
2. **Specific-Forgery Bias**: Detectors tend to focus on artifacts tied to specific forgery methods while ignoring common forgery features.
3. **Spatial Bias**: Detectors usually expect to observe structured forgery clues at the center of the image, without considering the actual face position or the existence of local forgery artifacts.

To solve these problems, the authors propose a framework named **ED4: Explicit Data-level Debiasing for Deepfake Detection**, which aims to improve the generalization ability of deepfake detectors through explicit data-level debiasing. ED4 contains two main modules:

- **ClockMix**: By splicing the faces and backgrounds of different images sector by sector, it generates mixed images containing different identities, backgrounds, and forgery traces, in order to break the content bias and the specific-forgery bias.
- **Adversarial Spatial Consistency Module (AdvSCM)**: By introducing an adversarial generator that produces spatially inconsistent images and constraining the features extracted from them to be consistent, it prevents the detector from learning spatial bias.

### Main Contributions

1. **Explicit Debiasing**: Unlike existing methods that debias implicitly through network design, ED4 explicitly removes model biases through data augmentation.
2. **Flexibility and Effectiveness**: ClockMix and AdvSCM can be flexibly applied to various deepfake detectors, significantly improving their generalization ability and robustness.
3. **Experimental Verification**: In extensive experiments, ED4 outperforms existing methods on multiple benchmark datasets, especially in cross-dataset evaluation.

### Formula Summary

- **ClockMix Mixed Image Formula**:
  \[ I_{ab}=\text{ClockMix}(I_a, I_b, \rho_1)=I_a\odot(M_{\text{base}} > \rho_1)+I_b\odot(M_{\text{base}}\leq\rho_1) \]
  where $M_{\text{base}}=(M - \rho_{\text{base}})\bmod 360$, and $M(i, j)=\left(\frac{180}{\pi}\operatorname{arctan2}(\delta_y - i, j - \delta_x)\right)\bmod 360$.
- **Mixed Image Label Formula**:
  \[ y_{ab}=1-(1 - y_a)(1 - y_b) \]
- **Adversarial Spatial Consistency Module Optimization Objective**:
  \[ \theta_e'=\arg\min_{\theta_e}D,\quad\text{where}\quad D = L_1(F_{s1}, F_{s2}) \]
  \[ \theta_a'=\arg\max_{\theta_a}D \]
- **Detection Loss Formula**:
  \[ L_d(y', y)=-[y\log(y')+(1 - y)\log(1 - y')] \]

Through the application of these methods and formulas, ED4 aims to improve the generalization ability of deepfake detectors.
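The ClockMix, label, and detection-loss formulas can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the `H x W` or `H x W x C` array layout, the default of using the image center for $(\delta_x, \delta_y)$, and the `eps` clipping in the loss are all assumptions made here for illustration.

```python
import numpy as np


def angle_map(height, width, center):
    """M(i, j): per-pixel angle in degrees, in [0, 360), around center=(delta_x, delta_y)."""
    delta_x, delta_y = center
    i, j = np.mgrid[0:height, 0:width]  # i is the row index, j the column index
    return np.degrees(np.arctan2(delta_y - i, j - delta_x)) % 360.0


def clockmix(img_a, img_b, rho_1, rho_base=0.0, center=None):
    """Sector mix: keep img_a where M_base > rho_1 and img_b where M_base <= rho_1."""
    h, w = img_a.shape[:2]
    if center is None:
        # Assumption: fall back to the geometric image center.
        center = ((w - 1) / 2.0, (h - 1) / 2.0)
    m_base = (angle_map(h, w, center) - rho_base) % 360.0
    take_a = m_base > rho_1
    if img_a.ndim == 3:
        take_a = take_a[..., None]  # broadcast the mask over the channel axis
    return np.where(take_a, img_a, img_b)


def mix_label(y_a, y_b):
    """y_ab = 1 - (1 - y_a)(1 - y_b): the mixture is fake (1) if either source is fake."""
    return 1 - (1 - y_a) * (1 - y_b)


def detection_loss(y_pred, y, eps=1e-7):
    """Binary cross-entropy L_d(y', y); eps clipping is an assumption to avoid log(0)."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -(y * np.log(y_pred) + (1 - y) * np.log(1 - y_pred))
```

For example, calling `clockmix(img_a, img_b, rho_1=180.0)` with the default center takes the rows above the center from `img_b` and the rows below it from `img_a`, while `mix_label(1, 0)` returns `1` because one of the two sources is forged.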

ED$^4$: Explicit Data-level Debiasing for Deepfake Detection

Unearthing Common Inconsistency for Generalisable Deepfake Detection

A3: Ambiguous Aberrations Captured via Astray-Learning for Facial Forgery Semantic Sublimation

Diff-ID: An Explainable Identity Difference Quantification Framework for DeepFake Detection

Transcending Forgery Specificity with Latent Space Augmentation for Generalizable Deepfake Detection

Selective Domain-Invariant Feature for Generalizable Deepfake Detection

Exposing the Deception: Uncovering More Forgery Clues for Deepfake Detection

Fake It till You Make It: Curricular Dynamic Forgery Augmentations towards General Deepfake Detection

Capture Artifacts via Progressive Disentangling and Purifying Blended Identities for Deepfake Detection

AVoiD-DF: Audio-Visual Joint Learning for Detecting Deepfake

Delving into the Local: Dynamic Inconsistency Learning for DeepFake Video Detection

Improving Deepfake Detection Generalization by Invariant Risk Minimization

DFCP: Few-Shot DeepFake Detection via Contrastive Pretraining

Dynamic Difference Learning with Spatio-temporal Correlation for Deepfake Video Detection

$\textit{X}^2$-DFD: A framework for e${X}$plainable and e${X}$tendable Deepfake Detection

Generalizing Deepfake Video Detection with Plug-and-Play: Video-Level Blending and Spatiotemporal Adapter Tuning

Exploiting Complementary Dynamic Incoherence for DeepFake Video Detection

Exploring varying color spaces through representative forgery learning to improve deepfake detection

A defensive framework for deepfake detection under adversarial settings using temporal and spatial features

TCSD: Triple Complementary Streams Detector for Comprehensive Deepfake Detection

UCF: Uncovering Common Features for Generalizable Deepfake Detection