SRTNet: a spatial and residual based two-stream neural network for deepfakes detection
Dengyong Zhang, Wenjie Zhu, Xiangling Ding, Gaobo Yang, Feng Li, Zelin Deng, Yun Song
DOI: https://doi.org/10.1007/s11042-022-13966-x
IF: 2.577
2022-10-10
Multimedia Tools and Applications
Abstract: With the rapid development of Internet technology, the Internet has become full of false information, and Deepfakes, as a kind of visual forgery content, have the greatest impact on people. The existing mainstream public Deepfakes datasets often contain millions of frames. If only the first N frames of each video are used to train the model, some key features may be lost. If all frames are used, the model overfits easily and training often takes several days, which consumes substantial computational resources. Therefore, we propose an adaptive video frame extraction algorithm that extracts the required number of frames from all frames of a video. The algorithm reduces data redundancy and increases feature richness. In addition, we design SRTNet, a two-stream Deepfakes detection network that combines the image spatial domain and the residual domain and consists of a spatial stream and a residual stream. The spatial stream takes the original RGB image as input to capture high-level tampering artifacts. The residual stream processes the input image with three sets of high-pass filters to obtain image residuals that capture tampering traces. The two streams are trained in parallel and their features are concatenated, so that the model captures tamper features from both the spatial and residual domains and achieves better detection performance. The experimental results show that the proposed adaptive frame extraction algorithm improves model performance, and that SRTNet achieves better results than previous work on mainstream Deepfake datasets.
computer science, information systems, theory & methods,engineering, electrical & electronic, software engineering
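
The two ideas in the abstract, picking a fixed number of frames from a whole video and feeding a high-pass residual to a second stream, can be illustrated with a minimal sketch. The abstract does not specify the adaptive extraction rule or the three filter sets, so everything here is an assumption made for illustration: `sample_frame_indices` uses plain evenly spaced sampling as a stand-in for the adaptive algorithm, `HIGH_PASS_KERNEL` is one common zero-sum 3x3 kernel rather than the paper's filters, and both function names are hypothetical.

```python
def sample_frame_indices(total_frames, num_required):
    """Return num_required frame indices spread evenly over a video.

    Assumption: evenly spaced sampling stands in for the paper's
    adaptive extraction algorithm, which the abstract does not detail.
    """
    if total_frames <= 0 or num_required <= 0:
        return []
    if num_required >= total_frames:
        # Fewer frames than requested: use every frame.
        return list(range(total_frames))
    step = total_frames / num_required
    # Take the frame at the middle of each equal-length segment.
    return [int(step * i + step / 2) for i in range(num_required)]


# Assumption: a single zero-sum 3x3 high-pass kernel, used only to show
# how a residual is computed. The paper uses three sets of filters.
HIGH_PASS_KERNEL = [
    [-1, 2, -1],
    [2, -4, 2],
    [-1, 2, -1],
]


def high_pass_residual(image, kernel=HIGH_PASS_KERNEL):
    """Slide the kernel over a grayscale image (list of rows) with no padding.

    Smooth regions give values near zero, so the output keeps mostly
    local high-frequency detail.
    """
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    residual = []
    for r in range(h - kh + 1):
        row = []
        for c in range(w - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += kernel[i][j] * image[r + i][c + j]
            row.append(acc)
        residual.append(row)
    return residual
```

In a two-stream setup of the kind the abstract describes, the RGB frame would go to the spatial stream and a residual like this to the residual stream, with the two feature vectors concatenated before classification. Because the kernel coefficients sum to zero, a perfectly flat image region produces a zero residual.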