Abstract: With the rapid development of deep learning, face-swapped videos generated by deep neural networks (i.e., Deepfake videos) are becoming increasingly hard to distinguish from real ones, and the threat they pose is growing accordingly. The literature contains several convolutional neural network-based detection algorithms for fake face videos. Although these algorithms perform well when the training and testing sets come from the same dataset, their performance can deteriorate dramatically in the cross-dataset scenario, where the training and testing sets come from different sources. Motivated by the way fake face videos are fabricated, this article treats fake face detection as an image splicing detection problem. A neural network borrowed from image segmentation is adopted to predict the tampered face area, and a tampering mask is obtained by denoising and thresholding the resulting probability map. Using the prior knowledge that face tampering mainly takes place in the face region, a new way of determining the Face-Intersection over Union (Face-IoU) is proposed, and the ratio calculation method is further improved. The resulting Face-Intersection over Union with Penalty (Face-IoUP) is used as the classification criterion for Deepfake video detection. The proposed method is implemented with three basic image segmentation neural networks separately and is tested on the TIMIT, FaceForensics++, and Fake Face in the Wild (FFW) datasets. Compared with current methods in the literature, the HTER (Half Total Error Rate) in the cross-dataset test decreases significantly, while the detection accuracy in the intra-dataset test remains high. On the Deep Fake Detection (DFD) dataset, which has higher synthesis quality, the proposed method still performs very well. Experimental results validate the proposed method and demonstrate its good generality.
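The pipeline described in the abstract (probability map, thresholded mask, overlap with the face region, HTER for evaluation) can be illustrated with a minimal Python sketch. The abstract does not define Face-IoU or Face-IoUP, so the sketch uses a plain intersection over union between the thresholded mask and a face-region mask as a stand-in, and it omits the denoising step. The function names, the 0.5 threshold, and the toy values are assumptions made for illustration only. HTER is computed with its standard definition, the mean of the false acceptance rate and the false rejection rate.

```python
def threshold_mask(prob_map, threshold=0.5):
    """Binarize a 2-D probability map (list of rows) into a 0/1 mask.

    The 0.5 threshold is an assumed value, not taken from the paper.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


def iou(mask_a, mask_b):
    """Plain intersection over union of two equally sized 0/1 masks.

    This is a stand-in only: the paper's Face-IoU and Face-IoUP are
    not specified in the abstract.
    """
    inter = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += a & b
            union += a | b
    return inter / union if union else 0.0


def hter(far, frr):
    """Half Total Error Rate: mean of false acceptance and false rejection rates."""
    return (far + frr) / 2.0


# Toy 4x4 example with made-up values.
prob_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.1, 0.9, 0.8, 0.1],
    [0.0, 0.7, 0.6, 0.2],
    [0.0, 0.1, 0.1, 0.0],
]
face_region = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

tamper_mask = threshold_mask(prob_map)
overlap = iou(tamper_mask, face_region)
error = hter(far=0.1, frr=0.3)
```

In a detector built along these lines, a frame would be flagged as fake when the overlap score exceeds some decision threshold. How the paper sets that threshold and applies its penalty term is not stated in the abstract.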

Double-Stream Segmentation Network with Temporal Self-attention for Deepfake Video Detection

Hierarchical Supervisions with Two-Stream Network for Deepfake Detection

Deepfake Videos Detection Based on Image Segmentation with Deep Neural Networks

Refining Localized Attention Features with Multi-Scale Relationships for Enhanced Deepfake Detection in Spatial-Frequency Domain

Towards Spatio-temporal Collaborative Learning: An End-to-End Deepfake Video Detection Framework

Multi-attentional Deepfake Detection

SRTNet: a spatial and residual based two-stream neural network for deepfakes detection

Exposing Deepfake Videos with Spatial, Frequency and Multi-scale Temporal Artifacts

Interactive Two-Stream Network Across Modalities for Deepfake Detection

Detecting Deepfake Video by Learning Two-Level Features with Two-Stream Convolutional Neural Network

Video Detection Method Based on Temporal and Spatial Foundations for Accurate Verification of Authenticity

Locate and Verify: A Two-Stream Network for Improved Deepfake Detection

DeepFake detection method based on multi-scale interactive dual-stream network

Temporal Consistency Based Deep Face Forgery Detection Network

Detection of Deepfake Videos Using Long-Distance Attention

Deepfake Detection Using Fusion Channel Information in a Multi-attentional Model

DeepFidelity: Perceptual Forgery Fidelity Assessment for Deepfake Detection

Detecting Deepfake-Forged Contents with Separable Convolutional Neural Network and Image Segmentation

Dynamic Difference Learning with Spatio-temporal Correlation for Deepfake Video Detection

Undercover Deepfakes: Detecting Fake Segments in Videos

Multi-feature fusion based face forgery detection with local and global characteristics