Abstract: Deepfakes are wreaking havoc on society. Deepfake content is created with artificial intelligence and machine learning to replace one person’s likeness with another’s in pictures or recorded videos. Although visual media manipulation is not new, deepfakes mark a breakthrough in the creation of fake media and information, and these manipulated pictures and videos are likely to have an enormous societal impact. Deepfake generation relies on Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) to automate the creation of fake content that is becoming increasingly difficult to detect with the human eye. Automated DL-based solutions can therefore be an efficient approach to detecting deepfakes. However, although DL systems can make robust predictions, their “black-box” nature means that those predictions cannot be fully trusted. Explainability is the first step toward transparency, but the current inability of DL systems to explain their decisions to human users limits their efficacy. Explainable Artificial Intelligence (XAI) can address this problem by interpreting the predictions of these systems. This work presents a comprehensive study of deepfake detection using DL methods and analyzes the results of the most effective model with Local Interpretable Model-Agnostic Explanations (LIME) to assess its validity and reliability. The study classifies real and deepfake images using several Convolutional Neural Network (CNN) models to find the most accurate one, and uses the LIME algorithm to explain which parts of an image caused the model to make a specific classification. The dataset is taken from Kaggle and includes 70 k real images from the Flickr dataset collected by Nvidia and 70 k fake faces generated by StyleGAN at 256 px in size.
For the experiments, Jupyter Notebook, TensorFlow, NumPy, and Pandas were used as software, and InceptionResNetV2, DenseNet201, InceptionV3, and ResNet152V2 were used as CNN models. All of these models performed well: InceptionV3 reached 99.68% accuracy, ResNet152V2 reached 99.19%, and DenseNet201 reached 99.81%. InceptionResNetV2 achieved the highest accuracy, 99.87%, and its predictions were later verified with the LIME algorithm for XAI, where the proposed method also performed best. The results and their reliability support the proposed method as an effective approach to detecting deepfake images.
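
To illustrate the idea behind the LIME step described above, the following is a minimal NumPy sketch of a LIME-style explanation for a single image. It is not the authors' code and does not use the `lime` library: it assumes a simple square grid of patches in place of a superpixel segmentation, and `toy_predict` is a hypothetical stand-in for a trained CNN that returns the probability of "fake". The function name `lime_patch_importance` and all of its parameters are likewise illustrative assumptions.

```python
import numpy as np


def lime_patch_importance(image, predict_fn, grid=4, n_samples=500,
                          kernel_width=0.25, seed=0):
    """Simplified LIME-style explanation for one image.

    image      : (H, W) or (H, W, C) float array.
    predict_fn : black-box model (assumed interface); takes a batch of
                 images and returns one "fake" probability per image.
    Returns a (grid, grid) array of per-patch weights from a locally
    fitted, proximity-weighted linear surrogate model.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]

    # Assign every pixel to one cell of a grid x grid layout.
    rows = np.arange(h) * grid // h
    cols = np.arange(w) * grid // w
    segments = rows[:, None] * grid + cols[None, :]      # shape (H, W)
    n_seg = grid * grid

    # Random binary masks: 1 keeps a patch, 0 replaces it with the mean.
    z = rng.integers(0, 2, size=(n_samples, n_seg))
    z[0] = 1                                             # unperturbed image
    keep = z[:, segments].astype(bool)                   # (n_samples, H, W)
    if image.ndim == 3:
        keep = keep[..., None]
    batch = np.where(keep, image[None], image.mean())

    # Query the black box on every perturbed image.
    y = np.asarray(predict_fn(batch), dtype=float)

    # Proximity kernel: samples with fewer removed patches weigh more.
    dist = 1.0 - z.mean(axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)

    # Weighted least squares with an intercept column.
    X = np.hstack([np.ones((n_samples, 1)), z])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[1:].reshape(grid, grid)


def toy_predict(batch):
    """Hypothetical classifier: the score depends only on the top-left 8x8 region."""
    return batch[:, :8, :8].mean(axis=(1, 2))


# Synthetic 32x32 image with a bright top-left patch (not real data).
image = np.random.default_rng(1).random((32, 32)) * 0.2
image[:8, :8] = 1.0

importance = lime_patch_importance(image, toy_predict)
top_patch = divmod(int(np.argmax(importance)), importance.shape[1])
```

In this toy setup the surrogate should assign its largest weight to the top-left patch, since that is the only region the stand-in classifier looks at. In a real pipeline, `predict_fn` would wrap the trained CNN and the grid would typically be replaced by a superpixel segmentation.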
