WATCHER: Wavelet-Guided Texture-Content Hierarchical Relation Learning for Deepfake Detection

Yuan Wang, Chen Chen, Ning Zhang, Xiyuan Hu
DOI: https://doi.org/10.1007/s11263-024-02116-5
IF: 13.369
2024-05-25
International Journal of Computer Vision
Abstract: Breathtaking advances in face forgery techniques produce visually untraceable deepfake videos, and the potential malicious abuse of these techniques has sparked great concern. Existing deepfake detectors primarily focus on specific forgery patterns, using global features extracted by CNN backbones for forgery detection. Due to inadequate exploration of content and texture features, they often overfit to method-specific forged regions and thus generalize poorly to increasingly realistic forgeries. In this paper, we propose a Wavelet-guided Texture-Content HiErarchical Relation (WATCHER) Learning framework to delve deeper into relation-aware texture-content features. Specifically, we propose a Wavelet-guided AutoEncoder scheme to retrieve a general visual representation that is aware of high-frequency details for understanding forgeries. To further excavate fine-grained counterfeit clues, a Texture-Content Attention Maps Learning module is presented to enrich the contextual information of content and texture features via multi-level attention maps in a hierarchical learning protocol. Finally, we propose a Progressive Multi-domain Feature Interaction module to perform semantic reasoning on relationship-enhanced texture-content forgery features. Extensive experiments on popular benchmark datasets substantiate the superiority of our WATCHER model, which consistently outperforms state-of-the-art methods by a significant margin.
computer science, artificial intelligence
What problem does this paper attempt to address?