Abstract: Previous deepfake detection methods mostly depend on low-level textural features, which are vulnerable to perturbations and fall short when detecting unseen forgery methods. In contrast, high-level semantic features are less susceptible to perturbations and are not limited to forgery-specific artifacts, and thus generalize better. Motivated by this, we propose a detection method that uses high-level semantic features of faces to identify inconsistencies in the temporal domain. We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video classification network, initialized with a meta-functional face encoder for enriched facial representation. In this way, we can take advantage of both the powerful spatio-temporal model and the high-level semantic information of faces. Furthermore, to leverage easily accessible real face data and guide the model to focus on spatio-temporal features, we design a Dynamic Video Self-Blending (DVSB) method that efficiently generates training samples with diverse spatio-temporal forgery traces from real facial videos. Based on this, we train our framework in two stages. The first stage employs a novel self-supervised contrastive learning scheme, in which we encourage the network to focus on forgery traces by pushing videos generated by the same forgery process toward similar representations. Building on the representation learned in the first stage, the second stage fine-tunes on a face forgery detection dataset to build a deepfake detector. Extensive experiments validate that UniForensics outperforms existing face forgery detection methods in generalization ability and robustness. In particular, our method achieves 95.3% and 77.2% cross-dataset AUC on the challenging Celeb-DFv2 and DFDC datasets, respectively.
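The first-stage objective described in the abstract, where videos produced by the same forgery process are pulled toward similar representations, can be illustrated with a small sketch. This is not the authors' loss: it assumes a generic supervised-contrastive (InfoNCE-style) formulation, and the function name `same_process_contrastive_loss`, the `process_ids` labels, and the temperature of 0.1 are all hypothetical choices made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def same_process_contrastive_loss(embeddings, process_ids, temperature=0.1):
    """Illustrative SupCon-style loss (an assumption, not the paper's exact objective).

    embeddings:  list of per-video feature vectors (lists of floats)
    process_ids: hypothetical integer id of the forgery process that produced
                 each video; videos sharing an id are treated as positives
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and process_ids[j] == process_ids[i]]
        if not positives:
            continue  # an anchor with no positive contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Average negative log-probability of picking each positive.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy features: the first two and the last two vectors point in similar directions.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = same_process_contrastive_loss(features, [0, 0, 1, 1])
loss_mismatched = same_process_contrastive_loss(features, [0, 1, 0, 1])
```

In this toy setup the loss is lower when the process labels agree with the geometry of the features (`loss_matching`) than when they do not (`loss_mismatched`), which is the behavior a first-stage objective of this kind would reward.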

Protecting World Leader Using Facial Speaking Pattern Against Deepfakes

Spatial-temporal Transformer Network for Protecting Person-of-interest from Deepfaking

Protecting World Leaders Against Deep Fakes

Restore DeepFakes Video Frames Via Identifying Individual Motion Styles

Building an Invisible Shield for Your Portrait against Deepfakes

FaceShield: Defending Facial Image against Deepfake Threats

Protecting Celebrities from DeepFake with Identity Consistency Transformer

Facial Features Matter: a Dynamic Watermark based Proactive Deepfake Detection Approach

Defending Fake via Warning: Universal Proactive Defense Against Face Manipulation

FTFDNet: Learning to Detect Talking Face Video Manipulation with Tri-Modality Interaction

Face Forgery Detection Based on Facial Region Displacement Trajectory Series

Multi-feature fusion based face forgery detection with local and global characteristics

FakeTransformer: Exposing Face Forgery From Spatial-Temporal Representation Modeled By Facial Pixel Variations

Deepfake Detection with Data Privacy Protection

UniForensics: Face Forgery Detection via General Facial Representation

An Audio-Visual Attention Based Multimodal Network for Fake Talking Face Videos Detection

Detection of Deepfake Videos Using Long-Distance Attention

Exploiting Facial Relationships and Feature Aggregation for Multi-Face Forgery Detection

ID-Guard: A Universal Framework for Combating Facial Manipulation via Breaking Identification

FPC‐Net: Learning to Detect Face Forgery by Adaptive Feature Fusion of Patch Correlation with CG‐Loss

Hiding Faces in Plain Sight: Defending DeepFakes by Disrupting Face Detection