Abstract: In recent years, the abuse of a face-swap technique called deepfake has raised enormous public concern. So far, a large number of deepfake videos (known as "deepfakes") have been crafted and uploaded to the internet, calling for effective countermeasures. One promising countermeasure against deepfakes is deepfake detection. Several deepfake datasets, such as DeepfakeDetection and FaceForensics++, have been released to support the training and testing of deepfake detectors. While these have greatly advanced deepfake detection, most of the real videos in these datasets are filmed with a few volunteer actors in limited scenes, and the fake videos are crafted by researchers using a few popular deepfake software tools. Detectors developed on these datasets may become less effective against real-world deepfakes on the internet. To better support detection against real-world deepfakes, in this paper, we introduce a new dataset, WildDeepfake, which consists of 7,314 face sequences extracted from 707 deepfake videos collected completely from the internet. WildDeepfake is a small dataset that can be used, in addition to existing datasets, to develop and test the effectiveness of deepfake detectors against real-world deepfakes. We conduct a systematic evaluation of a set of baseline detection networks on both existing datasets and our WildDeepfake dataset, and show that WildDeepfake is indeed a more challenging dataset, where the detection performance can decrease drastically. We also propose two (i.e., 2D and 3D) Attention-based Deepfake Detection Networks (ADDNets) to leverage the attention masks on real/fake faces for improved detection. We empirically verify the effectiveness of ADDNets on both existing datasets and WildDeepfake. The dataset is available at: https://github.com/OpenTAI/wild-deepfake

What problem does this paper attempt to address?

This paper addresses the challenge of detecting deepfake videos, particularly in real-world scenarios. With advances in deep learning techniques such as autoencoders and generative adversarial networks (GANs), deepfakes have become easier to produce and more realistic. As a result, deepfake videos now circulate widely on the internet, raising serious concerns about political manipulation, reputational damage, and the potential fabrication of terror events. Developing effective deepfake detection methods has therefore become a priority.
To support the training and testing of deepfake detectors, researchers have created multiple deepfake datasets, such as DeepfakeDetection and FaceForensics++. However, most of the real videos in these datasets are filmed in limited scenes with a few volunteers, while the fake videos are generated by researchers using a few popular deepfake tools. Detectors trained on them may therefore perform poorly on real-world deepfakes from the internet.
To address this gap, the paper introduces a new deepfake dataset called WildDeepfake, collected entirely from the internet. It includes 7,314 face sequences extracted from 707 deepfake videos. Compared to existing datasets, WildDeepfake offers more diverse scenes, more people per scene, and richer facial expressions. The deepfake videos in this dataset also tend to be of high quality, possibly because they were produced with extensive training and careful tuning on high-resolution facial images.
The paper systematically evaluates a set of baseline detection networks on both existing datasets and WildDeepfake. The results confirm that these detectors perform well on existing datasets but degrade significantly on WildDeepfake, indicating that real-world deepfakes are harder to detect.
To improve detection performance, the paper proposes Attention-based Deepfake Detection Networks (ADDNets), in both 2D and 3D versions. ADDNets use attention masks generated by facial landmark detection to re-weight the low-level features of the face. These re-weighted features are then used for image-level or sequence-level deepfake detection. Experimental results show that ADDNets achieve improved detection performance on both existing datasets and WildDeepfake. In conclusion, the main contributions of this paper are the collection and annotation of WildDeepfake, a new and more challenging real-world deepfake detection dataset, and the proposal of a novel detection network architecture, ADDNets, aimed at improving the detection of real-world deepfakes.
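The idea of re-weighting low-level features with a landmark-derived attention mask can be illustrated with a minimal NumPy sketch. This is not the ADDNets implementation: the Gaussian-bump mask, the residual form `features * (1 + mask)`, and the function names `landmark_attention_mask` and `reweight_features` are all assumptions made for illustration, since the summary does not say how the masks are built or applied.

```python
import numpy as np


def landmark_attention_mask(landmarks, height, width, sigma=4.0):
    """Build a soft mask in [0, 1] with a Gaussian bump at each (x, y) landmark.

    Assumption: the real method may construct its masks differently;
    this is only one simple way to turn landmarks into a spatial mask.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    mask = np.zeros((height, width), dtype=np.float64)
    for x, y in landmarks:
        bump = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        mask = np.maximum(mask, bump)  # keep the strongest response per pixel
    return mask


def reweight_features(features, mask):
    """Re-weight a (C, H, W) feature map with an (H, W) mask.

    Assumption: a residual form is used so that regions far from any
    landmark are kept (weight 1) rather than zeroed out.
    """
    return features * (1.0 + mask[None, :, :])


# Hypothetical example: three landmarks on a 16x16 low-level feature map.
landmarks = [(4, 5), (11, 5), (8, 11)]
mask = landmark_attention_mask(landmarks, height=16, width=16)
features = np.ones((3, 16, 16))
weighted = reweight_features(features, mask)
print(weighted.shape)
```

In a 2D detector such a mask would be applied per image; a 3D variant would apply one mask per frame of a face sequence before the sequence-level classifier.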

WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection
