DOR: Detecting Deepfake Videos Using the Dissonance Between Intra- and Inter-Frame Maps

Yang Xiao, Yongzheng Zhang, Yafei Sang, Fengyu Wang
DOI: https://doi.org/10.2139/ssrn.4149719
2022-01-01
SSRN Electronic Journal
Abstract: Generative adversarial networks (GANs) and their variants have been successfully applied to the synthesis of facial images. However, the resulting spread of fake information poses potential security concerns. In particular, the detection of deepfake videos still faces severe challenges. This study argues that if a video is represented by different modalities, manipulating one modality alone, which is a common way to create deepfake videos, will lead to a certain dissonance between modalities. Consequently, a novel dual-network scheme referred to as DOR (dissonance between optical flow and RGB images) is proposed for deepfake detection. The proposed method extracts intra- and inter-frame features from RGB images and optical flow via a dual-network architecture, and captures the dissonance between the extracted features with contrastive losses. The proposed scheme was experimentally evaluated using the latest public datasets: DeeperForensics++, UADFV, DFTIMIT, and Celeb-DF. The results demonstrated that DOR generally outperformed state-of-the-art methods. In particular, for low-resolution mixed datasets, the detection accuracy was improved by up to 6.8%. This study demonstrates the validity of using the dissonance between different modalities for deepfake detection, and explores an alternative direction for deepfake detection research.
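The abstract does not give the exact form of the contrastive losses, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes the classic pairwise contrastive loss with a margin: for a real video the RGB (intra-frame) and optical-flow (inter-frame) embeddings are pulled together, while for a manipulated video they are pushed at least `margin` apart. The function name `contrastive_loss`, the `margin` parameter, and the plain-list feature vectors are all hypothetical choices made for this example.

```python
import math


def contrastive_loss(rgb_feats, flow_feats, is_real, margin=1.0):
    """Mean pairwise contrastive loss over a batch of embedding pairs.

    Assumption (not from the paper): real videos should have consistent
    RGB and optical-flow embeddings, fake videos dissonant ones.
    """
    total = 0.0
    for rgb, flow, real in zip(rgb_feats, flow_feats, is_real):
        # Euclidean distance between the two modality embeddings
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(rgb, flow)))
        if real:
            # Real video: penalize any distance between modalities
            total += dist ** 2
        else:
            # Fake video: penalize only distances smaller than the margin
            total += max(margin - dist, 0.0) ** 2
    return total / len(rgb_feats)


if __name__ == "__main__":
    rgb = [[0.0, 0.0], [0.0, 0.0]]
    flow = [[0.0, 0.0], [3.0, 4.0]]
    # First pair is a consistent real video, second a dissonant fake video
    print(contrastive_loss(rgb, flow, [True, False]))
```

Under this assumed formulation, a low loss means that the distance between the two embeddings already separates real from fake videos, so a threshold on that distance could serve as a simple detection score.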