Abstract: DeepFakes have raised serious societal concerns, leading to a surge in detection-based forensics methods in recent years. Face forgery recognition is a standard detection method that usually follows a two-phase pipeline. While such methods perform well in ideal experimental environments, they face challenges when dealing with DeepFakes in the wild, which involve complex backgrounds and multiple faces of varying sizes. Moreover, most face forgery recognition methods can only process one face at a time. One straightforward way to address this issue is to process multiple faces simultaneously by integrating face extraction and forgery detection in an end-to-end fashion, adapting advanced object detection architectures. However, as these object detection architectures are designed to capture the discriminative features of different object categories rather than the subtle forgery traces among faces, direct adaptation suffers from limited representation ability. In this paper, we propose COMICS, an end-to-end framework for multi-face forgery detection. COMICS integrates face extraction and forgery detection in a seamless manner and adapts to advanced object detection architectures. The proposed bi-grained contrastive learning approach explores face forgery traces at both the coarse- and fine-grained levels. Specifically, coarse-grained contrastive learning captures the discriminative features among positive and negative proposal pairs at multiple layers produced by the proposal generator, while fine-grained contrastive learning captures the pixel-wise discrepancy between the forged and original areas of the same face and the pixel-wise content inconsistency among different faces. Extensive experiments on the OpenForensics and FFIW datasets demonstrate that our method outperforms its counterparts and shows great potential for integration into various architectures.
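
To make the coarse-grained idea more concrete, the following is a minimal sketch of a supervised contrastive loss over proposal embeddings, where proposals sharing a real/forged label are treated as positive pairs and all others as negatives. It is an illustration under stated assumptions, not the loss used in COMICS: the abstract does not give the formulation, so the function name `supervised_contrastive_loss`, the `temperature` parameter, and the use of cosine similarity are all hypothetical choices.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Average contrastive loss over anchors that have at least one positive.

    embeddings: list of feature vectors, one per face proposal (assumed input).
    labels: list of ints, e.g. 0 = real, 1 = forged (assumed encoding).
    temperature: softmax temperature (hypothetical default).
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        # Temperature-scaled cosine similarity to every other proposal.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in logits if labels[j] == labels[i]]
        if not positives:
            continue  # anchor has no same-label partner, so it is skipped
        # Log-sum-exp over all other proposals, shifted for numerical stability.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        # Mean negative log-probability of picking a positive.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


if __name__ == "__main__":
    # Toy proposal features: two "real" and two "forged" faces.
    features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(supervised_contrastive_loss(features, [0, 0, 1, 1]))
```

In this sketch the loss is small when same-label proposals cluster together and large when they are mixed, which is the behavior a contrastive objective is meant to encourage. Applying it at multiple feature layers, as the abstract describes, would amount to summing such a term per layer.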

Uncovering visual attention-based multi-level tampering traces for face forgery detection

Face Forgery Detection with Long-Range Noise Features and Multilevel Frequency-Aware Clues

Cross-attention based two-branch networks for document image forgery localization in the Metaverse

Refining Localized Attention Features with Multi-Scale Relationships for Enhanced Deepfake Detection in Spatial-Frequency Domain

MSTA-Net: Forgery Detection by Generating Manipulation Trace Based on Multi-scale Self-texture Attention

TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization

Attention Consistency Refined Masked Frequency Forgery Representation for Generalizing Face Forgery Detection

Detect Any Deepfakes: Segment Anything Meets Face Forgery Detection and Localization

Exploring Disentangled Content Information for Face Forgery Detection

Deep Face Forgery Detection

MC-LCR: Multi-modal contrastive classification by locally correlated representations for effective face forgery detection

Learning to mask: Towards generalized face forgery detection

Multi-level feature disentanglement network for cross-dataset face forgery detection

OpenForensics: Large-Scale Challenging Dataset For Multi-Face Forgery Detection And Segmentation In-The-Wild

Learning Forgery Region-Aware and ID-Independent Features for Face Manipulation Detection

FLAG: frequency-based local and global network for face forgery detection

Counterfactual Explanations for Face Forgery Detection via Adversarial Removal of Artifacts

Exploiting Facial Relationships and Feature Aggregation for Multi-Face Forgery Detection

Uncertainty guided test-time training for face forgery detection

COMICS: End-to-end Bi-grained Contrastive Learning for Multi-face Forgery Detection

Deep fake detection using an optimal deep learning model with multi head attention-based feature extraction scheme