Herd Mentality in Augmentation -- Not a Good Idea! A Robust Multi-stage Approach towards Deepfake Detection

Monu, Rohan Raju Dhanakshirur
2024-10-08
Abstract: The rapid rise of deepfake technology has raised significant concerns about digital media integrity, so detecting deepfakes is crucial for safeguarding digital media. However, most standard image classifiers fail to distinguish between fake and real faces. Our analysis reveals that this failure is due to the model's inability to explicitly focus on the artefacts typically found in deepfakes. We propose an enhanced architecture based on the GenConViT model, which incorporates a weighted loss, updated augmentation techniques, and masked eye pretraining. The proposed model improves the F1 score by 1.71% and the accuracy by 4.34% on the Celeb-DF v2 dataset. The source code for our model is available at https://github.com/Monu-Khicher-1/multi-stage-learning
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses several key challenges in deepfake detection. Specifically, the authors identify the following issues:

1. **Limitations of Existing Image Classifiers**: Standard image classifiers cannot reliably distinguish real faces from fake ones. According to the authors' analysis, the main reason is that these models do not explicitly focus on the artefacts commonly found in deepfakes, which leads to poor performance in real-world applications.
2. **Improper Use of Data Augmentation Techniques**: Many existing deepfake detection methods use standard data augmentation techniques (such as Gaussian noise, random brightness contrast, and sharpening). The augmented images these techniques produce introduce noise, which disrupts ideal detection conditions and degrades model performance.
3. **Over-reliance of the Model on Eye Features**: Deep neural networks focus too heavily on the eyes as a distinguishing feature during training. This makes the model prone to overfitting and leaves it weak at exploiting other facial features.
4. **Class Imbalance Problem**: Deepfake detection datasets are seriously imbalanced: fake images far outnumber real ones. This imbalance hurts generalization and biases the model towards classifying all images as fake.

To address these problems, the authors propose a multi-stage method to improve deepfake detection, which includes:

- **Improved Data Augmentation Techniques**: Use only basic augmentation techniques (such as rotation and flipping) to avoid introducing noise.
- **Pre-training with Eye-Occluded Data**: Pre-train on a dataset in which the eyes are occluded, so that the model learns other facial features.
- **Weighted Loss Function**: Introduce a weighted loss function to counter the class imbalance and improve the model's ability to recognize real images.
Through these improvements, the authors' model achieves a 1.71% increase in F1 score and a 4.34% increase in accuracy on the Celeb-DF v2 dataset.
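
To make the eye-occlusion and weighted-loss ideas more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (their code is in the linked repository). The function names `mask_eye_region`, `inverse_frequency_weights`, and `weighted_bce` are hypothetical. The eye bounding box is assumed to be supplied by hand rather than by a landmark detector, and inverse-frequency weighting is an assumed choice, since the summary does not say how the paper sets its loss weights.

```python
import math


def mask_eye_region(image, eye_box, fill=0):
    """Return a copy of `image` (a list of pixel rows) with a rectangle blanked out.

    `eye_box` is (top, left, bottom, right), end-exclusive. Assumption: a real
    pipeline would obtain this box from a facial landmark detector.
    """
    top, left, bottom, right = eye_box
    masked = [list(row) for row in image]  # copy so the input is untouched
    for r in range(max(top, 0), min(bottom, len(masked))):
        for c in range(max(left, 0), min(right, len(masked[r]))):
            masked[r][c] = fill
    return masked


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Assumption: one common weighting heuristic, not necessarily the paper's.
    """
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n = len(labels)
    return {cls: n / (len(counts) * cnt) for cls, cnt in counts.items()}


def weighted_bce(probs, labels, weights, eps=1e-7):
    """Weighted binary cross-entropy, normalized by the sum of sample weights.

    `probs` are predicted probabilities of the "fake" class; labels use
    1 = fake and 0 = real.
    """
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss = -math.log(p) if y == 1 else -math.log(1.0 - p)
        total += weights[y] * loss
        weight_sum += weights[y]
    return total / weight_sum


if __name__ == "__main__":
    # Toy 3x4 "image"; blank out columns 1-2 of the first two rows.
    image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
    print(mask_eye_region(image, (0, 1, 2, 3)))

    # Imbalanced toy batch: nine fake samples, one real sample.
    labels = [1] * 9 + [0]
    weights = inverse_frequency_weights(labels)

    # A degenerate model that calls everything "fake" with probability 0.9.
    probs = [0.9] * 10
    print(weighted_bce(probs, labels, {0: 1.0, 1: 1.0}))  # unweighted loss
    print(weighted_bce(probs, labels, weights))           # weighted loss
```

On the toy batch, the model that labels every image as fake gets a noticeably higher loss once the rare real class is up-weighted. This is the effect the weighted loss is meant to have on imbalanced data such as Celeb-DF v2.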