DEEP-STA: Deep Learning-Based Detection and Localization of Various Types of Inter-Frame Video Tampering Using Spatiotemporal Analysis

Naheed Akhtar, Muhammad Hussain, Zulfiqar Habib
DOI: https://doi.org/10.3390/math12121778
IF: 2.4
2024-06-08
Mathematics
Abstract: Inter-frame tampering in surveillance videos undermines the integrity of video evidence, potentially influencing law enforcement investigations and court decisions. It is the most common form of video tampering and is often imperceptible to the human eye. Various algorithms based on handcrafted features have been proposed to identify such tampering. Automatically detecting and localizing tampering and determining its type, while maintaining accuracy and processing speed, remains a challenge. We propose a novel method for detecting inter-frame tampering that exploits a 2D convolutional neural network (2D-CNN) over spatiotemporal information, with fusion, for deep automatic feature extraction; employs an autoencoder to significantly reduce the computational overhead by reducing the dimensionality of the feature space; analyzes long-range dependencies within video frames using long short-term memory (LSTM) and gated recurrent units (GRU), which helps to detect tampering traces; and finally adds a fully connected (FC) layer with softmax activation for classification. The structural similarity index measure (SSIM) is used to localize tampering. We perform extensive experiments on datasets comprising challenging videos with different complexity levels. The results demonstrate that the proposed method can identify and pinpoint tampered regions with more than 90% accuracy, irrespective of video frame rate, video format, number of tampered frames, and compression quality factor.
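The abstract states only that SSIM is used to localize tampering and does not give the procedure. As a rough illustration of the general idea, the sketch below scores each pair of consecutive frames with a simplified single-window SSIM and flags pairs whose score falls well below the video's typical value, which is where a frame deletion or insertion would be expected to show up. The function names, the median-based threshold, and the `k` and `min_drop` parameters are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def global_ssim(x, y, data_range=255.0):
    """Simplified SSIM computed over the whole frame as a single window.

    Standard SSIM averages over local sliding windows; this global
    variant is an assumption made to keep the sketch short.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizer for the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def suspect_boundaries(frames, k=3.0, min_drop=0.05):
    """Return (indices, scores), where index i marks a suspicious cut
    between frames[i] and frames[i + 1].

    A pair is flagged when its SSIM lies below the median score by more
    than max(k * MAD, min_drop). This rule is hypothetical.
    """
    scores = np.array(
        [global_ssim(a, b) for a, b in zip(frames[:-1], frames[1:])]
    )
    median = np.median(scores)
    mad = np.median(np.abs(scores - median))
    threshold = median - max(k * mad, min_drop)
    indices = [int(i) for i in np.where(scores < threshold)[0]]
    return indices, scores


# Synthetic demo: a sinusoidal pattern drifting one pixel per frame,
# with frames 10..19 deleted to imitate a frame-deletion forgery.
columns = np.arange(64)
original = [
    np.tile(127.5 + 100.0 * np.sin(2 * np.pi * (columns + t) / 64), (32, 1))
    for t in range(30)
]
tampered = original[:10] + original[20:]
boundaries, ssim_scores = suspect_boundaries(tampered)
print(boundaries)
```

On real footage, scene motion and compression also lower SSIM between neighbouring frames, so a fixed rule like this would need tuning; in the proposed method, detection and type classification are handled by the learned model, with SSIM used for localization.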