Abstract: The proliferation of fake images generated by deepfake techniques has significantly threatened the trustworthiness of digital information, creating a pressing need for face forgery detection. However, because human face images are highly similar to one another and artifact information is subtle, most deep face forgery detection methods still struggle with incomplete extraction of artifact information, limited performance on low-quality forgeries, and insufficient generalization across datasets. To address these issues, this paper proposes a novel noise-aware multi-scale deepfake detection model. First, a progressive spatial attention module is introduced, which learns two types of spatial feature weights: a boosting weight and a suppression weight. The boosting weight highlights salient regions, while the suppression weight enables the model to capture more subtle artifact information. Through multiple boosting-suppression stages, the proposed model progressively focuses on different facial regions and extracts multi-scale RGB features. Additionally, a noise-aware two-stream network is introduced, which leverages frequency-domain features and fuses image noise with the multi-scale RGB features. This integration enhances the model's ability to handle post-processed images. Furthermore, the model learns global features from the multi-modal features through multiple convolutional layers and combines them with local similarity features for deepfake detection, thereby improving robustness. Experimental results on several benchmark datasets demonstrate the superiority of the proposed method over state-of-the-art techniques. Our contributions lie in the progressive spatial attention module, which effectively addresses overfitting in CNNs, and in the integration of noise-aware features with multi-scale RGB features. These innovations lead to enhanced accuracy and generalization performance in face forgery detection.
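The boosting-suppression idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the attention map here is a simple min-max normalized channel mean rather than a learned weight, the input is assumed to be non-negative (for example, post-ReLU activations), and the names `boost_suppress_stage`, `progressive_attention`, and the parameter `alpha` are hypothetical. The sketch only shows how suppressing the most salient region in one stage can shift the focus to a different region in the next stage; it does not model the multi-scale or noise-stream parts.

```python
import numpy as np

def boost_suppress_stage(features, alpha=0.8, eps=1e-8):
    """One illustrative boosting-suppression stage.

    features: non-negative array of shape (C, H, W).
    alpha:    hypothetical suppression strength in [0, 1].
    Returns (boosted, suppressed, attention).
    """
    # Stand-in for a learned spatial attention map: channel mean,
    # min-max normalized to [0, 1].
    energy = features.mean(axis=0)
    attention = (energy - energy.min()) / (energy.max() - energy.min() + eps)

    boost_weight = 1.0 + attention             # emphasizes salient regions
    suppress_weight = 1.0 - alpha * attention  # damps salient regions

    boosted = features * boost_weight          # broadcast over channels
    suppressed = features * suppress_weight
    return boosted, suppressed, attention

def progressive_attention(features, num_stages=3, alpha=0.8):
    """Run several stages; each stage sees the previous stage's suppressed map."""
    outputs, peaks = [], []
    current = features
    for _ in range(num_stages):
        boosted, current, attention = boost_suppress_stage(current, alpha)
        outputs.append(boosted)
        # Record where this stage focused most strongly.
        row, col = np.unravel_index(np.argmax(attention), attention.shape)
        peaks.append((int(row), int(col)))
    return outputs, peaks

if __name__ == "__main__":
    # Toy feature map with a strong peak and a weaker secondary peak.
    x = np.ones((4, 8, 8))
    x[:, 2, 3] = 3.0
    x[:, 5, 5] = 2.0
    _, peaks = progressive_attention(x, num_stages=2)
    print(peaks)
```

In this toy input, the first stage focuses on the strongest location and the second stage moves to the weaker one, which is the intuition behind progressively attending to different facial regions.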

Multi-level feature disentanglement network for cross-dataset face forgery detection

Face Forgery Detection with Long-Range Noise Features and Multilevel Frequency-Aware Clues

Exploring Disentangled Content Information for Face Forgery Detection

Multi-feature fusion based face forgery detection with local and global characteristics

Artifacts-Disentangled Adversarial Learning for Deepfake Detection

MDCF-Net: Multi-Scale Dual-Branch Network for Compressed Face Forgery Detection

UniForensics: Face Forgery Detection via General Facial Representation

Research on video face forgery detection model based on multiple feature fusion network

Exploiting Facial Relationships and Feature Aggregation for Multi-Face Forgery Detection

Exploring Bi-Level Inconsistency Via Blended Images for Generalizable Face Forgery Detection

Generalizing Face Forgery Detection with High-frequency Features

Learning Forgery Region-Aware and ID-Independent Features for Face Manipulation Detection

Face forgery detection by progressively enhancing spatial and frequency-aware features

Combined spatial and frequency dual stream network for face forgery detection

Attention Consistency Refined Masked Frequency Forgery Representation for Generalizing Face Forgery Detection

FedForgery: Generalized Face Forgery Detection with Residual Federated Learning

Common Forgery Artifact Driven Deepfake Face Detection

MC-LCR: Multi-modal contrastive classification by locally correlated representations for effective face forgery detection

Noise-aware progressive multi-scale deepfake detection

Dynamic-Aware Federated Learning for Face Forgery Video Detection

COMICS: End-to-end Bi-grained Contrastive Learning for Multi-face Forgery Detection