Abstract: Hyper-realistic avatars in the metaverse have already raised security concerns about deepfake techniques: a generated video “recording” may be mistaken for a real recording of the people it depicts. As a result, deepfake detection has drawn considerable attention in the multimedia forensic community. Although existing deepfake detection methods perform fairly well in the intra-dataset scenario, many of them yield unsatisfactory results in cross-dataset testing, which has more practical value because the forged faces in the training and testing datasets come from different domains. To tackle this issue, we propose a novel Domain-Invariant and Patch-Discriminative feature learning framework, DI&PD. For image-level feature learning, a single-side adversarial domain generalization is introduced to eliminate domain variances and learn domain-invariant features from training samples produced by different manipulation methods, together with a global and local random crop augmentation strategy that generates more data views of forged images at various scales. A graph structure is then built by splitting the learned image-level feature maps, with each spatial location corresponding to a local patch, which facilitates patch representation learning through message passing among similar nodes. Two types of center losses are used to learn more discriminative features in both the image-level and patch-level embedding spaces. Extensive experimental results on several datasets demonstrate the effectiveness and generalization ability of the proposed method compared with other state-of-the-art methods.
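
To make the center-loss idea in the abstract concrete, the following is a minimal sketch of the standard center loss (half the squared distance between each feature and its class center, averaged over the batch). It is an assumption that the paper's two center losses resemble this form; their exact definitions are not given here. The function names `center_loss` and `class_means` are hypothetical, and class means stand in for the learnable centers a real training loop would update.

```python
from typing import Dict, List, Sequence


def class_means(features: Sequence[Sequence[float]],
                labels: Sequence[int]) -> Dict[int, List[float]]:
    """Per-class mean vectors, used here as a stand-in for learned centers."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, xi in enumerate(x):
            acc[i] += xi
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def center_loss(features: Sequence[Sequence[float]],
                labels: Sequence[int],
                centers: Dict[int, Sequence[float]]) -> float:
    """Batch mean of 0.5 * ||x_i - c_{y_i}||^2 (standard center loss)."""
    total = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        total += 0.5 * sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return total / len(features)


# Toy embeddings: label 0 = real, label 1 = fake (illustrative values only).
features = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]]
labels = [0, 0, 1, 1]
centers = class_means(features, labels)
loss = center_loss(features, labels, centers)
```

Minimizing this quantity pulls same-class embeddings toward a shared center, which is the intra-class compactness the abstract refers to; the same function could be applied to image-level or patch-level embeddings.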

What problem does this paper attempt to address?

The problem this paper attempts to address is the insufficient generalization ability of existing deepfake detection methods in cross-dataset testing, where the forged faces in the training and testing datasets come from different domains. Specifically, while existing methods perform well within a single dataset, their performance drops significantly when they encounter forged faces generated by different synthesis algorithms, preprocessing methods, or attack techniques. To tackle this challenge, the authors propose a new Domain-Invariant and Patch-Discriminative feature learning framework (DI&PD), which aims to learn image-level domain-invariant features from forged faces synthesized by different algorithms and to promote patch-level discriminative representation learning through a graph-based structure, thereby achieving better generalization in cross-dataset testing.

The main contributions of the paper are as follows:

1. **Proposed a new Domain-Invariant and Patch-Discriminative feature learning framework (DI&PD)**: This framework mines domain-invariant forgery features through a single-side adversarial domain generalization mechanism and generates multi-scale data views by combining global and local random cropping strategies.
2. **Constructed a graph-based structure**: The learned image-level feature maps are split into multiple nodes, each corresponding to a local patch, and patch-level discriminative features are learned through message passing and feature transformation.
3. **Introduced patch-level and image-level center losses**: These losses make the learned representations more discriminative by reducing intra-class distance and increasing inter-class distance.
4. **Conducted extensive experiments**: Experiments on multiple benchmark datasets show that the proposed method outperforms existing deepfake detection methods in terms of generalization ability.

Through these methods, the paper aims to improve the generalization ability and robustness of deepfake detection in cross-dataset testing.
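
The graph-based patch learning described above can be illustrated with a minimal sketch. It assumes that each spatial location of an H x W feature map becomes one node, that edges connect each node to its k most cosine-similar nodes, and that one message-passing round averages a node with the mean of its neighbors. These choices, and the helper names `split_into_nodes`, `knn_neighbors`, and `message_pass`, are illustrative assumptions rather than the paper's actual graph construction or update rule.

```python
import math
from typing import List

Vector = List[float]


def split_into_nodes(feature_map: List[List[Vector]]) -> List[Vector]:
    """Flatten an H x W grid of C-dim vectors into H*W node features."""
    return [list(vec) for row in feature_map for vec in row]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def knn_neighbors(nodes: List[Vector], k: int) -> List[List[int]]:
    """For each node, indices of its k most similar other nodes."""
    neighbors = []
    for i, a in enumerate(nodes):
        scored = sorted(
            ((cosine(a, b), j) for j, b in enumerate(nodes) if j != i),
            reverse=True,
        )
        neighbors.append([j for _, j in scored[:k]])
    return neighbors


def message_pass(nodes: List[Vector],
                 neighbors: List[List[int]]) -> List[Vector]:
    """One round: average each node with the mean of its neighbors."""
    updated = []
    for i, x in enumerate(nodes):
        nbrs = neighbors[i]
        if not nbrs:
            updated.append(list(x))
            continue
        mean = [sum(nodes[j][d] for j in nbrs) / len(nbrs)
                for d in range(len(x))]
        updated.append([(xd + md) / 2 for xd, md in zip(x, mean)])
    return updated


# Toy 2 x 2 feature map with 2 channels (illustrative values only).
feature_map = [[[1.0, 0.0], [0.9, 0.1]],
               [[0.0, 1.0], [0.1, 0.9]]]
nodes = split_into_nodes(feature_map)
neighbors = knn_neighbors(nodes, k=1)
updated = message_pass(nodes, neighbors)
```

In this toy example the two top-row patches link to each other and the two bottom-row patches link to each other, so message passing smooths features only among similar patches. A learned feature transformation, as mentioned in the contribution list, would normally follow each aggregation step and is omitted here.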

Domain-invariant and Patch-discriminative Feature Learning for General Deepfake Detection

Selective Domain-Invariant Feature for Generalizable Deepfake Detection

Patch-DFD: Patch-based end-to-end DeepFake discriminator

Refining Localized Attention Features with Multi-Scale Relationships for Enhanced Deepfake Detection in Spatial-Frequency Domain

Improving Deepfake Detection Generalization by Invariant Risk Minimization

Multi-feature fusion based face forgery detection with local and global characteristics

DomainForensics: Exposing Face Forgery across Domains via Bi-directional Adaptation

Unearthing Common Inconsistency for Generalisable Deepfake Detection

Preserving Fairness Generalization in Deepfake Detection

Learning a Deep Dual-Level Network for Robust DeepFake Detection

DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion

GM-DF: Generalized Multi-Scenario Deepfake Detection

DFIL: Deepfake Incremental Learning by Exploiting Domain-invariant Forgery Clues

MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake Detection

FFR_FD: Effective and Fast Detection of DeepFakes Based on Feature Point Defects

Manipulation-Invariant Fingerprints for Cross-Dataset Deepfake Detection

FFR_FD: Effective and fast detection of DeepFakes via feature point defects

Jointly learning and training: using style diversification to improve domain generalization for deepfake detection

A defensive framework for deepfake detection under adversarial settings using temporal and spatial features

MMD Based Discriminative Learning for Face Forgery Detection

Unmasking DeepFakes with simple Features