Abstract: Face forgery poses significant security threats, and detecting it in an open-set setting presents substantial challenges for existing detection models. These detectors have two primary limitations: they cannot generalize across unknown forgery domains, and they adapt inefficiently to new data. To address these issues, we introduce an approach to face forgery detection that is both general and parameter-efficient. It builds on the assumption that different forgery source domains exhibit distinct style statistics. Accordingly, we design a forgery-style mixture formulation that augments the diversity of forgery source domains, enhancing the model's generalizability across unseen domains. Previous methods typically require fully fine-tuning pre-trained networks, consuming substantial time and computational resources. Drawing on recent advancements in vision transformers (ViT) for face forgery detection, we instead develop a parameter-efficient ViT-based detection model that includes lightweight forgery feature extraction modules and enables the model to extract global and local forgery clues simultaneously. During training, we optimize only the inserted lightweight modules and keep the original ViT structure with its pre-trained ImageNet weights. This training strategy preserves the informative pre-trained knowledge while flexibly adapting the model to the task of Deepfake detection. Extensive experimental results demonstrate that the designed model achieves state-of-the-art generalizability with significantly fewer trainable parameters, representing an important step toward open-set Deepfake detection in the wild.
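The forgery-style mixture idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulation, so the code below assumes a MixStyle-like scheme: per-channel means and standard deviations of two feature maps are interpolated, and one feature map is re-normalized with the mixed statistics. The function names, the toy feature maps, and the Beta(0.1, 0.1) sampling of the mixing weight are all illustrative assumptions, not the paper's implementation.

```python
import random
from statistics import fmean, pstdev

EPS = 1e-6  # guards against division by zero for constant channels


def channel_stats(feature):
    """Return per-channel means and standard deviations.

    `feature` is a list of channels; each channel is a flat list of activations.
    """
    means = [fmean(channel) for channel in feature]
    stds = [pstdev(channel) + EPS for channel in feature]
    return means, stds


def mix_forgery_style(feat_a, feat_b, lam):
    """Re-style feat_a with statistics interpolated between feat_a and feat_b.

    lam = 1.0 keeps the style of feat_a; lam = 0.0 adopts the style of feat_b.
    """
    mu_a, sd_a = channel_stats(feat_a)
    mu_b, sd_b = channel_stats(feat_b)
    mixed = []
    for channel, ma, sa, mb, sb in zip(feat_a, mu_a, sd_a, mu_b, sd_b):
        mu_mix = lam * ma + (1.0 - lam) * mb
        sd_mix = lam * sa + (1.0 - lam) * sb
        # Normalize with the original statistics, then apply the mixed ones
        mixed.append([sd_mix * (v - ma) / sa + mu_mix for v in channel])
    return mixed


rng = random.Random(0)
# Toy feature maps standing in for two hypothetical forgery source domains
feat_domain_a = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(3)]
feat_domain_b = [[rng.gauss(2.0, 3.0) for _ in range(16)] for _ in range(3)]

lam = rng.betavariate(0.1, 0.1)  # assumed sampling scheme for the mixing weight
mixed = mix_forgery_style(feat_domain_a, feat_domain_b, lam)
```

Only the style statistics change here: the normalized content of `feat_domain_a` is kept, so the mixed features behave like samples from a new, synthetic forgery domain.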

What problem does this paper attempt to address?

This paper addresses two main problems: existing deepfake detection methods generalize poorly to unknown forgery domains, and adapting them to new data is computationally inefficient. Specifically:

1. **Insufficient generalization ability**: The performance of existing detection models declines significantly on unseen forgery domains, because these models usually do not generalize well to new datasets.
2. **High computational cost of adapting to new data**: Existing detection models usually require full fine-tuning of the pre-trained network when adapting to new data, which consumes a large amount of time and computational resources.

To address these problems, the authors propose a general and parameter-efficient open-set deepfake detection method (OSDFD). Its main innovations are:

- **Forgery style mixing module**: Randomly mixing the feature statistics of different forgery styles increases the diversity of the source forgery domains, which improves the model's generalization to unseen forgery domains.
- **Parameter-efficient fine-tuning strategy**: Lightweight Adapter and LoRA layers are inserted into the pre-trained ViT model, and only these modules are optimized, while the original ViT structure and its pre-trained ImageNet weights are kept unchanged. The pre-trained knowledge is thus retained while the model adapts flexibly to the deepfake detection task.
- **Central difference convolution (CDC) adapter**: CDC operations are introduced to extract local forgery artifacts and improve the model's detection performance.

Experimental results show that the method achieves state-of-the-art generalization on multiple unseen deepfake datasets with a significantly reduced number of trainable parameters, representing significant progress in open-set deepfake detection.
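The parameter-efficient fine-tuning strategy described above can be made concrete with a minimal LoRA sketch. It covers only the LoRA part, not the Adapter or CDC modules, and uses the standard LoRA form W x + (alpha / r) B A x with a frozen W. The class name, rank 8, and alpha 16 are assumed values for illustration; the summary does not state the paper's settings.

```python
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update.

    Computes W x + (alpha / rank) * B (A x), where only A and B are trained.
    """

    def __init__(self, frozen_weight, rank, alpha, rng):
        self.frozen_weight = frozen_weight  # pre-trained weights, never updated
        out_dim, in_dim = len(frozen_weight), len(frozen_weight[0])
        self.scale = alpha / rank
        # A starts random and B starts at zero, so the update is zero at initialization
        self.lora_a = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.lora_b = [[0.0] * rank for _ in range(out_dim)]

    def forward(self, x):
        base = matvec(self.frozen_weight, x)
        update = matvec(self.lora_b, matvec(self.lora_a, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_parameter_count(self):
        return sum(len(row) for row in self.lora_a) + sum(len(row) for row in self.lora_b)

    def frozen_parameter_count(self):
        return sum(len(row) for row in self.frozen_weight)


rng = random.Random(0)
dim, rank = 768, 8  # 768 is the ViT-Base hidden size; rank 8 is an assumed value
weight = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(dim)]
layer = LoRALinear(weight, rank=rank, alpha=16, rng=rng)
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]

# 12288 trainable vs. 589824 frozen parameters for this single projection
print(layer.trainable_parameter_count(), layer.frozen_parameter_count())
```

Because B is initialized to zero, the layer reproduces the frozen projection exactly before training, which is one way the pre-trained knowledge is preserved while only a small number of parameters are updated.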

Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture
