Detection of deepfake technology in images and videos

Yong Liu, Tianning Sun, Zonghui Wang, Xu Zhao, Ruosi Cheng, Baolan Shi
DOI: https://doi.org/10.1504/ijahuc.2024.136851
2024-02-23
International Journal of Ad Hoc and Ubiquitous Computing
Abstract: Existing methods for detecting deepfake images and videos suffer from low accuracy, weak generalisation, and insufficient consideration of cross-dataset detection. To address these problems, this article adopts a combined miniXception and long short-term memory (LSTM) model to analyse deepfake images and videos. First, the miniXception model is used as the backbone network to extract spatial features. Second, LSTM is used to extract temporal features between two frames, and temporal and spatial attention mechanisms are introduced after the convolutional layer to better capture long-distance dependencies in the sequence and improve the detection accuracy of the model. Finally, cross-dataset training and testing are conducted using the same database and a transfer learning method. Focal loss is employed as the loss function during training to balance the samples and improve the generalisation of the model. The experimental results show that the detection accuracy on the FaceSwap dataset reaches 99.05%, which is 0.39% higher than that of the convolutional neural network-gated recurrent unit (CNN-GRU) model, and that the model parameters require only 10.01 MB. Overall, the approach improves both the generalisation ability and the detection accuracy of the model.
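The abstract names focal loss as the training loss but gives no formula or hyperparameters. As a rough illustration only, the sketch below implements the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), in plain Python. The function name `binary_focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are assumptions taken from common usage of focal loss, not values reported by the authors.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one sample (hypothetical helper, not the authors' code).

    p     -- predicted probability that the sample is fake (class 1)
    y     -- ground-truth label, 1 for fake and 0 for real
    gamma -- focusing parameter; assumed default, not from the paper
    alpha -- class-balance weight for class 1; assumed default
    """
    # Clamp the probability so that log() never receives 0.
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.9, 1)  # confident, correct prediction
hard = binary_focal_loss(0.1, 1)  # confident, wrong prediction
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy. Larger `gamma` values shift the training signal towards hard, misclassified samples, which is the balancing effect the abstract attributes to focal loss.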
computer science, information systems, telecommunications