Misleading Deep-Fake Detection with GAN Fingerprints

Vera Wesselkamp, Konrad Rieck, Daniel Arp, Erwin Quiring
DOI: https://doi.org/10.48550/arXiv.2205.12543
2022-05-25
Abstract: Generative adversarial networks (GANs) have made remarkable progress in synthesizing realistic-looking images that effectively outsmart even humans. Although several detection methods can recognize these deep fakes by checking for image artifacts from the generation process, multiple counterattacks have demonstrated their limitations. These attacks, however, still require certain conditions to hold, such as interacting with the detection method or adjusting the GAN directly. In this paper, we introduce a novel class of simple counterattacks that overcomes these limitations. In particular, we show that an adversary can remove indicative artifacts, the GAN fingerprint, directly from the frequency spectrum of a generated image. We explore different realizations of this removal, ranging from filtering high frequencies to more nuanced frequency-peak cleansing. We evaluate the performance of our attack with different detection methods, GAN architectures, and datasets. Our results show that an adversary can often remove GAN fingerprints and thus evade the detection of generated images.
Computer Vision and Pattern Recognition, Cryptography and Security, Machine Learning, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses how an adversary can evade existing deep-fake detection methods by removing characteristic frequency patterns (i.e., GAN fingerprints) from images generated by generative adversarial networks (GANs). Specifically, it introduces several simple counterattacks that remove GAN fingerprints from generated images without adjusting the GAN model or interacting with the detection method, so that the images evade detection.

### Paper Background

Generative adversarial networks (GANs) have made remarkable progress in synthesizing realistic images that can deceive even humans. This progress has raised concerns about deep fakes, since fake images may be misused, for example for propaganda and the spread of disinformation. In response, researchers have developed a variety of deep-fake detection methods, which usually identify image artifacts left behind by the generation process. These detection methods, in turn, are challenged by various counterattacks.

### Paper Contributions

1. **Counterattack on GAN fingerprints**: The paper shows that removing characteristic artifacts (i.e., GAN fingerprints) from generated images is a simple and effective way to evade deep-fake detection methods.
2. **Manipulation strategies**: The paper proposes four different methods for modifying the frequency spectrum, ranging from simple high-frequency removal to more refined artifact removal.
3. **Comprehensive evaluation**: The paper evaluates the four attacks extensively, using three detection methods, four GAN architectures, and two datasets. The results show that the attacks can significantly reduce the detection rates for images from various GANs.

### Main Methods

1. **Non-targeted fingerprint removal**:
   - **Frequency-bar attack**: Removes the high-frequency part of the spectrum by applying an ideal low-pass filter. This method is simple and effective but affects image details.
2. **Targeted fingerprint removal**:
   - **Mean-spectrum attack**: Computes the average spectral difference between natural images and generated images, then subtracts this difference from the generated images.
   - **Frequency-peak attack**: Operates on the periodic peaks in the spectrum and modifies only the frequency coefficients that exceed a certain threshold.
   - **Regression-weight attack**: Uses a Lasso regression model to estimate the fingerprint, then changes the frequency coefficients in inverse proportion to the regression weights.

### Experimental Results

- **Attack success rate**: In most cases, the attacks significantly reduce the detection rate of deep-fake images, although the effect varies with the detection method and GAN architecture.
- **Image quality**: The attacks have some impact on image quality, but in most cases this impact is acceptable. For the frequency-bar attack in particular, image quality remains good, as reflected in a relatively high PSNR.

### Conclusion

The paper shows that deep-fake detection methods can be evaded effectively by removing GAN fingerprints. However, the success rate of each attack depends on multiple factors, and no single attack strategy works in all situations. This finding underscores the ongoing arms race between attack and defense in deep-fake detection and the need for continued, iterative research on both sides.
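
To make the spectrum manipulations summarized under Main Methods more concrete, the following is a minimal sketch of two of them: ideal low-pass filtering and mean-spectrum subtraction. It is an illustration under stated assumptions, not the authors' implementation. It assumes grayscale images stored as NumPy arrays, uses the complex 2-D FFT as the frequency representation, and applies a circular cutoff; the paper's actual transform, filter shape, and parameters may differ. The function names and the `cutoff` parameter are hypothetical.

```python
import numpy as np


def ideal_low_pass(image: np.ndarray, cutoff: float) -> np.ndarray:
    """Zero every 2-D frequency whose radius exceeds `cutoff` (cycles per pixel).

    Assumption: a circular cutoff on the FFT spectrum stands in for the
    high-frequency removal described in the summary.
    """
    fy = np.fft.fftfreq(image.shape[0])[:, None]  # vertical frequencies
    fx = np.fft.fftfreq(image.shape[1])[None, :]  # horizontal frequencies
    mask = np.sqrt(fx**2 + fy**2) <= cutoff       # True for frequencies we keep
    return np.fft.ifft2(np.fft.fft2(image) * mask).real


def mean_spectrum_fingerprint(generated: np.ndarray, natural: np.ndarray) -> np.ndarray:
    """Average spectral difference between two image stacks of shape (n, h, w).

    Assumption: the fingerprint is estimated on complex FFT coefficients.
    """
    gen_mean = np.fft.fft2(generated, axes=(-2, -1)).mean(axis=0)
    nat_mean = np.fft.fft2(natural, axes=(-2, -1)).mean(axis=0)
    return gen_mean - nat_mean


def subtract_fingerprint(image: np.ndarray, fingerprint: np.ndarray) -> np.ndarray:
    """Remove an estimated fingerprint from one image in the frequency domain."""
    return np.fft.ifft2(np.fft.fft2(image) - fingerprint).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    yy, xx = np.indices((16, 16))
    # Synthetic stand-in for a GAN artifact: a faint checkerboard pattern,
    # which concentrates its energy at the highest spatial frequency.
    artifact = 0.5 * np.where((yy + xx) % 2 == 0, 1.0, -1.0)
    natural = rng.normal(size=(20, 16, 16))
    generated = rng.normal(size=(20, 16, 16)) + artifact

    fingerprint = mean_spectrum_fingerprint(generated, natural)
    cleaned = subtract_fingerprint(generated[0], fingerprint)
    filtered = ideal_low_pass(generated[0], cutoff=0.25)
    print("mean abs change (mean-spectrum):", np.abs(cleaned - generated[0]).mean())
    print("mean abs change (low-pass):", np.abs(filtered - generated[0]).mean())
```

The two functions mirror the trade-off described in the summary: the low-pass filter needs no reference data but discards all fine detail above the cutoff, whereas the mean-spectrum variant needs sets of natural and generated images but changes only the components in which the two sets differ on average.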