LampMark: Proactive Deepfake Detection via Training-Free Landmark Perceptual Watermarks

Tianyi Wang, Mengxiao Huang, Harry Cheng, Xiao Zhang, Zhiqi Shen
DOI: https://doi.org/10.1145/3664647.3680869
2024-11-26
Abstract: Deepfake facial manipulation has garnered significant public attention due to its impacts on enhancing human experiences and posing privacy threats. Despite numerous passive algorithms that have been attempted to thwart malicious Deepfake attacks, they mostly struggle with the generalizability challenge when confronted with hyper-realistic synthetic facial images. To tackle the problem, this paper proposes a proactive Deepfake detection approach by introducing a novel training-free landmark perceptual watermark, LampMark for short. We first analyze the structure-sensitive characteristics of Deepfake manipulations and devise a secure and confidential transformation pipeline from the structural representations, i.e. facial landmarks, to binary landmark perceptual watermarks. Subsequently, we present an end-to-end watermarking framework that imperceptibly and robustly embeds and extracts watermarks concerning the images to be protected. Relying on promising watermark recovery accuracies, Deepfake detection is accomplished by assessing the consistency between the content-matched landmark perceptual watermark and the robustly recovered watermark of the suspect image. Experimental results demonstrate the superior performance of our approach in watermark recovery and Deepfake detection compared to state-of-the-art methods across in-dataset, cross-dataset, and cross-manipulation scenarios.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the poor generalization of current Deepfake detection methods on hyper-realistic synthetic facial images. Many passive detection algorithms have been proposed to thwart malicious Deepfake attacks, but most struggle with such images and generalize poorly to unseen manipulations and datasets. To tackle this, the paper proposes a proactive Deepfake detection method built on a new training-free landmark perceptual watermark, LampMark for short.

The main contributions of the paper are as follows:

1. **Utilizing the structure-sensitive characteristics of facial landmarks**: The paper analyzes how Deepfake manipulations affect facial structure and designs a novel training-free landmark perceptual watermark to proactively defend against Deepfake attacks.
2. **Proposing a robust watermark embedding and extraction framework**: The framework robustly embeds landmark perceptual watermarks into facial images and extracts them again. As far as the authors know, this is the first time that both face-swapping and face-reenactment manipulations can be detected with a single robust watermark.
3. **Extensive experimental verification**: Experiments in in-dataset, cross-dataset, and cross-manipulation settings show that the method outperforms existing state-of-the-art algorithms in both watermark recovery and Deepfake detection.

### Formula Display

Key formulas from the paper:

- **Covariance matrix calculation**:
  \[ \text{Cov}(E_{lm})=\frac{1}{\text{len}(E_{lm})}(E_{lm}-\mu_{E_{lm}})^T(E_{lm}-\mu_{E_{lm}}) \]
  where $\mu_{E_{lm}}$ is the mean vector of $E_{lm}$.
- **Feature normalization**:
  \[ E_{\text{norm}}[:,i]=\frac{E_{\text{trans}}[:,i]-\min(E_{\text{trans}}[:,i])}{\max(E_{\text{trans}}[:,i])-\min(E_{\text{trans}}[:,i])} \]
  for all $l$ feature indices $0\leq i < l$.
- **Cellular automata encryption rule**:
  \[ s_{t + 1}^{i}= \begin{cases} s_t^{l-1}\oplus(s_t^0\vee s_t^1)&\text{for }i = 0,\\ s_t^{i-1}\oplus(s_t^i\vee s_t^{i+1})&\text{for }0 < i < l-1,\\ s_t^{l-2}\oplus(s_t^{l-1}\vee s_t^0)&\text{for }i = l-1. \end{cases} \]
- **Loss function**:
  - Image reconstruction loss:
    \[ L_I=\|I_{\text{rec}}-I\|_2 \]
  - Watermark recovery loss:
    \[ L_m=\|m_{\text{rec}}-m\|_2 \]
  - Adversarial loss:
    \[ L_{\text{adv}}=-\mathbb{E}[\log(D(I))]+\mathbb{E}[\log(1 - D(I_{\text{rec}}))] \]
  - Deepfake synthetic quality preservation loss:
    \[ L_G=\|G(I, I_s)-G(I_{\text{rec}})\|_2 \]
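
The covariance, feature normalization, and cellular automata formulas listed above can be sketched in NumPy as follows. This is a minimal illustration and not the authors' implementation: the function names `covariance`, `min_max_normalize`, and `automaton_step` are hypothetical, rows of the input arrays are assumed to be samples and columns features, and the automaton is assumed to use periodic (wrap-around) boundaries for its first and last cells.

```python
import numpy as np


def covariance(e_lm: np.ndarray) -> np.ndarray:
    """Cov(E_lm) = (E_lm - mu)^T (E_lm - mu) / len(E_lm); rows are samples."""
    centered = e_lm - e_lm.mean(axis=0)  # subtract the mean vector mu
    return centered.T @ centered / len(e_lm)


def min_max_normalize(e_trans: np.ndarray) -> np.ndarray:
    """Rescale each column (feature index i) of E_trans to [0, 1]."""
    col_min = e_trans.min(axis=0)
    col_range = e_trans.max(axis=0) - col_min
    # Assumption: a constant column (zero range) maps to 0
    # instead of triggering a division by zero.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (e_trans - col_min) / safe_range


def automaton_step(state: np.ndarray) -> np.ndarray:
    """One update s_{t+1}^i = s_t^{i-1} XOR (s_t^i OR s_t^{i+1})."""
    left = np.roll(state, 1)    # left[i] = state[i-1]; left[0] = state[l-1]
    right = np.roll(state, -1)  # right[i] = state[i+1]; right[l-1] = state[0]
    return left ^ (state | right)


if __name__ == "__main__":
    features = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
    print(covariance(features))
    print(min_max_normalize(features))

    bits = np.array([0, 0, 1, 0, 0], dtype=np.uint8)
    print(automaton_step(bits))
```

The update rule has the same form as the elementary cellular automaton known as Rule 30. `automaton_step` expects an integer or boolean array, because the bitwise operators `^` and `|` are not defined for floating-point arrays. How many update steps are applied, and how the normalized features are turned into bits, is not specified in the summary above and is left out of the sketch.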