Selective Domain-Invariant Feature for Generalizable Deepfake Detection

Yingxin Lai, Guoqing Yang, Yifan He, Zhiming Luo, Shaozi Li
2024-03-19
Abstract: With diverse presentation forgery methods emerging continually, detecting the authenticity of images has drawn growing attention. Although existing methods achieve impressive accuracy on the datasets they are trained on, they still perform poorly in unseen domains and are affected by forgery-irrelevant information such as background and identity, which hurts generalizability. To solve this problem, we propose a novel framework, Selective Domain-Invariant Feature (SDIF), which reduces the sensitivity to face forgery by fusing content features and styles. Specifically, we first use a Farthest-Point Sampling (FPS) training strategy to construct a task-relevant style sample representation space for fusing with content features. Then, we propose a dynamic feature extraction module that generates features with diverse styles to improve the performance and effectiveness of the feature extractor. Finally, a domain separation strategy is used to retain domain-related features that help distinguish between real and fake faces. Both qualitative and quantitative results on existing benchmarks and proposals demonstrate the effectiveness of our approach.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the poor generalization of Deepfake image detection across domains (i.e., different datasets or forgery methods). Existing methods can achieve high accuracy on the training dataset, but they perform poorly on unseen datasets and are easily influenced by irrelevant information such as background and identity. To tackle this challenge, the authors propose a new framework, Selective Domain-Invariant Feature (SDIF), which reduces sensitivity to facial forgery by integrating content features and style features. The framework includes the following key modules:

1. **Farthest-Point Sampling (FPS)**: Used to construct a task-relevant style sample representation space, making the style samples as dispersed and uniform as possible.
2. **Dynamic Feature Extractor (DFE)**: Generates features with diverse styles to improve the performance and effectiveness of the feature extractor.
3. **Domain Separation Strategy**: Retains domain-related features to help distinguish between real and fake faces.

With these modules, SDIF aims to improve cross-domain detection performance, and the authors report experimental results on multiple benchmark datasets that demonstrate its effectiveness.
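The farthest-point sampling idea mentioned above, choosing style samples that are as dispersed as possible, can be illustrated with a minimal sketch. This is a generic greedy FPS in plain Python, not the authors' implementation: the function name `farthest_point_sampling`, the `seed` parameter, and the representation of a style as a small tuple of numbers are all assumptions made for illustration.

```python
import math
import random


def farthest_point_sampling(points, k, seed=0):
    """Greedily pick k indices; each new pick is the point farthest
    from everything already selected (generic FPS, illustrative only)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    rng = random.Random(seed)
    first = rng.randrange(len(points))  # arbitrary starting point
    selected = [first]
    # min_dist[i] = distance from point i to its nearest selected point
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(selected) < k:
        # The next sample is the point with the largest nearest-selected distance
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Hypothetical "style" vectors (e.g. per-image feature statistics),
# forming three loose groups; the values are made up for this example.
styles = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
picked = farthest_point_sampling(styles, k=3)
```

Because each pick maximizes the distance to the already selected set, three picks from this toy data land in three different groups rather than clustering in the densest one, which is the "dispersed and uniform" property the summary attributes to the FPS step.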