Abstract: Intelligent video anomaly detection (VAD) methods play a crucial role in conserving human resources, reducing the financial burden on governments, and identifying abnormal behaviors promptly and accurately. Frame prediction, which performs VAD by predicting normal frames accurately while producing distorted predictions for anomalous frames, is a popular and efficient approach. Although Auto-Encoders (AE) perform well in video frame prediction, their ability to exploit temporal information is insufficient, and they do not degrade the reconstruction of anomalous videos enough. To address these shortcomings of AE in VAD, we propose a VAD method based on a cross-frame prediction mechanism and a spatio-temporal memory-enhanced pseudo-3D encoder. This method significantly enhances the recognition of abnormal activities in surveillance videos. First, we use a prediction mechanism that predicts future frames from multiple past frames sampled at intervals, which broadens the temporal information available to the model while limiting memory consumption. Then, we design a pseudo-3D encoder to encode the spatio-temporal information in videos, avoiding both the inability of a 2D encoder to capture temporal information and the complicated structure and excessive parameter count of a 3D encoder. Finally, we design a spatio-temporal memory block, trained with three loss functions, that stores the spatio-temporal information of normal videos and thereby widens the prediction differences between normal and abnormal examples. Experiments on the UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets show that the proposed method achieves AUC values of 99.4%, 90.5%, and 74.3%, respectively. Our method shows excellent performance among single-stage semi-supervised anomaly detection methods.
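The cross-frame prediction mechanism can be illustrated with a minimal sketch of how input frames might be selected. The abstract does not state the number of input frames, the interval, or the gap between the last input and the target, so the function names (`cross_frame_indices`, `make_samples`), the parameters, and the choice to place the target one interval after the last input are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple


def cross_frame_indices(target: int, num_inputs: int, interval: int) -> List[int]:
    """Return indices of past frames, spaced `interval` apart, for predicting `target`.

    Assumption: the last input frame sits `interval` frames before the target.
    """
    if num_inputs < 1 or interval < 1:
        raise ValueError("num_inputs and interval must be positive")
    first = target - num_inputs * interval
    if first < 0:
        raise ValueError("not enough past frames for this target")
    return [first + k * interval for k in range(num_inputs)]


def make_samples(num_frames: int, num_inputs: int, interval: int) -> List[Tuple[List[int], int]]:
    """Build (input indices, target index) pairs for a clip of `num_frames` frames."""
    start = num_inputs * interval  # earliest target with enough history
    return [
        (cross_frame_indices(t, num_inputs, interval), t)
        for t in range(start, num_frames)
    ]


if __name__ == "__main__":
    # Same number of input frames, hence the same input size per sample,
    # but the interval of 2 covers a wider temporal window.
    print(cross_frame_indices(4, num_inputs=4, interval=1))
    print(cross_frame_indices(8, num_inputs=4, interval=2))
```

With four inputs in both cases, the input tensor size per sample is unchanged, while the larger interval lets the same inputs span a longer stretch of the video. This is one plausible reading of how interval sampling adds temporal context without increasing memory use.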

Memory-guided Representation Matching for Unsupervised Video Anomaly Detection

Enhanced Memory Adversarial Network for Anomaly Detection

Memory-Augmented Spatial-Temporal Consistency Network for Video Anomaly Detection

Exploiting Spatial-temporal Correlations for Video Anomaly Detection

Learning Anomalies with Normality Prior for Unsupervised Video Anomaly Detection

Anomaly Detection with Prototype-Guided Discriminative Latent Embeddings

Temporal-Aware Self-Supervised Learning for Unsupervised Video Anomaly Detection

Dual Memory Units with Uncertainty Regulation for Weakly Supervised Video Anomaly Detection

Sensing Anomalies Like Humans: A Hominine Framework to Detect Abnormal Events from Unlabeled Videos

Video Anomaly Detection Based on Adaptive Multiple Auto-Encoders

Self-Supervised Attentive Generative Adversarial Networks for Video Anomaly Detection

Transformer Based Memory Network for Video Anomaly Detection

Collaborative Normality Learning Framework for Weakly Supervised Video Anomaly Detection

Appearance Blur-driven AutoEncoder and Motion-guided Memory Module for Video Anomaly Detection

Long Short-Term Dynamic Prototype Alignment Learning for Video Anomaly Detection

Generalized Video Anomaly Event Detection: Systematic Taxonomy and Comparison of Deep Models

Learning Appearance-motion Normality for Video Anomaly Detection

A Novel Unsupervised Video Anomaly Detection Framework Based on Optical Flow Reconstruction and Erased Frame Prediction

Video Anomaly Detection Based on Cross-Frame Prediction Mechanism and Spatio-Temporal Memory-Enhanced Pseudo-3D Encoder

Learning Appearance-Motion Synergy Via Memory-Guided Event Prediction for Video Anomaly Detection

Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation: A Unified Approach