FLAG: frequency-based local and global network for face forgery detection

DOI: https://doi.org/10.1007/s11042-024-18751-6
IF: 2.577
2024-03-29
Multimedia Tools and Applications
Abstract: Deepfake detection aims to mitigate the threat of manipulated content by identifying and exposing forgeries. However, previous methods tend to perform poorly in cross-dataset scenarios. To address this issue, we propose a hybrid network, the Frequency-based Local and Global (FLAG) network, which explores local and global information with the help of frequency-domain cues for better generalization. Because forged faces often exhibit flaws in the frequency domain, we design a Frequency-based Attention Enhancement Module (FAEM) to enhance the aggregation of the CNN and the Vision Transformer (ViT). In this design, local features from the CNN are attentively enhanced by selected frequency coefficients in FAEM, facilitating the learning of generalizable global features by the ViT module. The effectiveness of the proposed method is validated through extensive experiments, and generalization performance improves under cross-dataset scenarios. In particular, the proposed method obtains an AUC of 99.26% and an ACC of 96.56% in intra-dataset experiments on FaceForensics++ (C23).
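The abstract does not say which transform FAEM uses or how coefficients are selected, so the following is only a minimal sketch of the general idea of reweighting CNN feature channels with selected frequency coefficients, not the authors' module. It assumes a 2D DCT and a low-frequency block as the "selected" coefficients; the names `dct_basis`, `frequency_attention`, and `keep` are hypothetical, and the ViT stage is omitted.

```python
import numpy as np


def dct_basis(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis of shape (n, n); row k holds frequency k."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    basis = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0] *= np.sqrt(1.0 / n)
    basis[1:] *= np.sqrt(2.0 / n)
    return basis


def frequency_attention(features: np.ndarray, keep: int = 4) -> np.ndarray:
    """Reweight a (C, H, W) feature map with per-channel frequency gates.

    Assumption (not from the paper): the gate for each channel is a sigmoid
    of the mean magnitude of its keep x keep lowest-frequency DCT coefficients,
    centred on the average over channels.
    """
    _, h, w = features.shape
    bh, bw = dct_basis(h), dct_basis(w)
    # 2D DCT of every channel: B_h @ X @ B_w^T (matmul broadcasts over C).
    coeffs = bh @ features @ bw.T
    selected = coeffs[:, :keep, :keep]
    energy = np.abs(selected).mean(axis=(1, 2))
    gate = 1.0 / (1.0 + np.exp(-(energy - energy.mean())))
    # Scale each channel by its gate; the output keeps the input shape.
    return features * gate[:, None, None]
```

In a hybrid design like the one described, the reweighted feature map would then be split into tokens and passed to a transformer encoder; that step is left out here because the abstract gives no details about it.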
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering