DEEPFAKER: A Unified Evaluation Platform for Facial Deepfake and Detection Models

Li Wang,Xiangtao Meng,Dan Li,Xuhong Zhang,Shouling Ji,Shanqing Guo
DOI: https://doi.org/10.1145/3634914
IF: 2.717
2024-01-01
ACM Transactions on Privacy and Security
Abstract:Deepfake data contains realistically manipulated faces—its abuses pose a huge threat to the security and privacy-critical applications. Intensive research from academia and industry has produced many deepfake/detection models, leading to a constant race of attack and defense. However, due to the lack of a unified evaluation platform, many critical questions on this subject remain largely unexplored. How is the anti-detection ability of the existing deepfake models? How generalizable are existing detection models against different deepfake samples? How effective are the detection APIs provided by the cloud-based vendors? How evasive and transferable are adversarial deepfakes in the lab and real-world environment? How do various factors impact the performance of deepfake and detection models? To bridge the gap, we design and implement DEEPFAKER 1 a unified and comprehensive deepfake detection evaluation platform. Specifically, DEEPFAKER has integrated 10 state-of-the-art deepfake methods and 9 representative detection methods, while providing a user-friendly interface and modular design that allows for easy integration of new methods. Leveraging DEEPFAKER , we conduct a large-scale empirical study of facial deepfake/detection models and draw a set of key findings: (i) the detection methods have poor generalization on samples generated by different deepfake methods; (ii) there is no significant correlation between anti-detection ability and visual quality of deepfake samples; (iii) the current detection APIs have poor detection performance and adversarial deepfakes can achieve about 70% attack success rate on all cloud-based vendors, calling for an urgent need to deploy effective and robust detection APIs; (iv) the detection methods in the lab are more robust against transfer attacks than the detection APIs in the real-world environment; and (v) deepfake videos may not always be more difficult to detect after video compression. We envision that DEEPFAKER will benefit future research on facial deepfake and detection.
What problem does this paper attempt to address?