Video Anomaly Detection Based on Cross-Frame Prediction Mechanism and Spatio-Temporal Memory-Enhanced Pseudo-3D Encoder.

Xiaopeng Wen, Huicheng Lai, Guxue Gao, Yang Xiao, Tongguan Wang, Zhenhong Jia, Liejun Wang
DOI: https://doi.org/10.1016/j.engappai.2023.107057
IF: 8
2023-01-01
Engineering Applications of Artificial Intelligence
Abstract: Intelligent video anomaly detection (VAD) methods play a crucial role in conserving human resources, reducing the financial burden on governments, and promptly and accurately identifying abnormal behaviors. Frame prediction, which performs VAD by predicting normal video accurately while producing distorted predictions for anomalous video, is a popular and efficient approach. Although Auto-Encoders (AE) perform well in video frame prediction, they make insufficient use of temporal information and do not reconstruct anomalous videos poorly enough. To address these shortcomings of AE in VAD, we propose a VAD method based on a cross-frame prediction mechanism and a spatio-temporal memory-enhanced pseudo-3D encoder. This method significantly improves the recognition of abnormal activities in surveillance videos. First, we use a prediction mechanism that predicts future frames from multiple past frames sampled at intervals, which broadens the available information while limiting memory consumption. Second, we design a pseudo-3D encoder to encode the spatio-temporal information in videos, avoiding two problems: a 2D encoder cannot capture temporal information, and a 3D encoder has a complicated structure and too many parameters. Finally, we design a spatio-temporal memory block with three loss functions to store the spatio-temporal information of normal videos, which widens the gap between the predictions for normal and abnormal examples. Experiments on the UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets show that the proposed method achieves AUC values of 99.4%, 90.5%, and 74.3%, respectively. Our method shows excellent performance among single-stage semi-supervised anomaly detection methods.
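The cross-frame prediction idea described in the abstract (predicting a future frame from several past frames sampled at intervals) can be illustrated with a minimal sketch of the input selection step. The function name `cross_frame_indices`, the parameters `num_inputs` and `interval`, and the specific sampling rule are assumptions made for illustration; the abstract does not state the exact scheme the authors use.

```python
def cross_frame_indices(target, num_inputs=4, interval=2):
    """Return indices of the past frames used to predict frame `target`.

    Hypothetical sampling rule (an assumption, not taken from the paper):
    take `num_inputs` past frames spaced `interval` frames apart, ending
    `interval` frames before the target.
    """
    if num_inputs < 1 or interval < 1:
        raise ValueError("num_inputs and interval must be positive")
    first = target - num_inputs * interval
    if first < 0:
        raise ValueError("not enough past frames before the target")
    return list(range(first, target, interval))


# With an interval of 2, four stored frames cover eight frames of history.
spaced = cross_frame_indices(target=10, num_inputs=4, interval=2)
# With an interval of 1, the same four frames cover only four frames.
consecutive = cross_frame_indices(target=10, num_inputs=4, interval=1)
print(spaced, consecutive)
```

Under this assumed rule, the number of frames held in memory stays fixed at `num_inputs`, while the temporal span they cover grows with `interval`, which is one way to read the claim that the mechanism broadens the available information while limiting memory consumption.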