Abstract: Intelligent video anomaly detection (VAD) methods play a crucial role in conserving human resources, reducing the financial burden on governments, and identifying abnormal behaviors promptly and accurately. Frame prediction, which performs VAD by predicting normal frames accurately while producing distorted predictions for anomalous frames, is a popular and efficient approach. Although auto-encoders (AEs) perform well in video frame prediction, they make insufficient use of temporal information and often reconstruct anomalous videos too well. To address these shortcomings of AEs in VAD, we propose a VAD method based on a cross-frame prediction mechanism and a spatio-temporal memory-enhanced pseudo-3D encoder. This method significantly improves the recognition of abnormal activities in surveillance videos. First, we use a prediction mechanism that predicts future frames from multiple past frames sampled at intervals. This widens the temporal context available to the model while limiting memory consumption. Second, we design a pseudo-3D encoder to encode the spatio-temporal information in videos, avoiding two problems: a 2D encoder cannot capture information along the temporal dimension, and a 3D encoder has a complicated structure and too many parameters. Finally, we design a spatio-temporal memory block with three loss functions to store the spatio-temporal information of normal videos, which enlarges the gap between the prediction errors of normal and abnormal examples. Experiments on the UCSD Ped2, CUHK Avenue and ShanghaiTech datasets show that the proposed method achieves AUC values of 99.4%, 90.5% and 74.3%, respectively. Our method shows excellent performance among single-stage semi-supervised anomaly detection methods.
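The cross-frame sampling idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the number of input frames or the interval, so the function name `cross_frame_indices` and the parameters `num_inputs` and `interval` below are hypothetical, and the sketch only shows how interval sampling selects past frames; it is not the authors' implementation.

```python
def cross_frame_indices(target, num_inputs=4, interval=2):
    """Return the indices of the past frames used to predict frame `target`.

    Hypothetical helper: `num_inputs` and `interval` are illustrative
    defaults, not values taken from the paper.
    """
    if num_inputs < 1 or interval < 1:
        raise ValueError("num_inputs and interval must be positive")
    # Oldest frame needed when stepping back `interval` frames at a time.
    first = target - num_inputs * interval
    if first < 0:
        raise ValueError("not enough past frames for this target")
    # Evenly spaced past frames, ending `interval` frames before the target.
    return list(range(first, target, interval))


# Interval sampling versus consecutive sampling for the same target frame.
print(cross_frame_indices(10, num_inputs=4, interval=2))  # [2, 4, 6, 8]
print(cross_frame_indices(10, num_inputs=4, interval=1))  # [6, 7, 8, 9]
```

Under these assumptions, both calls feed four frames to the model, but the interval-2 call reaches back eight frames instead of four, which is one way to read the claim that interval sampling broadens the available information without increasing memory use.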

Spatio-Temporal AutoEncoder for Video Anomaly Detection.

Video Anomaly Detection Based on 3D Convolutional Auto-Encoder

Two-stream Deep Spatial-Temporal Auto-Encoder for Surveillance Video Abnormal Event Detection

Residual spatiotemporal autoencoder for unsupervised video anomaly detection

Spatio-Temporal Unity Networking for Video Anomaly Detection

Memory Enhanced Spatial-Temporal Graph Convolutional Autoencoder for Human-Related Video Anomaly Detection.

Video Anomaly Detection Based on Adaptive Multiple Auto-Encoders.

Spatial-Temporal Cascade Autoencoder for Video Anomaly Detection in Crowded Scenes.

Spatial Temporal Balanced Generative Adversarial AutoEncoder for Anomaly Detection

Spatiotemporal Masked Autoencoder with Multi-Memory and Skip Connections for Video Anomaly Detection

Video anomaly detection with spatio-temporal dissociation

Video Anomaly Detection Based on Spatio-Temporal Relationships among Objects

Attention-based residual autoencoder for video anomaly detection

Video Anomaly Detection Via Predictive Autoencoder with Gradient-Based Attention

Video Anomaly Detection Based on Cross-Frame Prediction Mechanism and Spatio-Temporal Memory-Enhanced Pseudo-3D Encoder.

Spatiotemporal consistency-enhanced network for video anomaly detection

Abnormal Events Detection Method for Surveillance Video Using an Improved Autoencoder with Multi-Modal Input

Exploiting Spatial-temporal Correlations for Video Anomaly Detection

Video anomaly detection based on a multi-layer reconstruction autoencoder with a variance attention strategy

Video Anomaly Detection Based on Convolutional Recurrent AutoEncoder

Multi Chunk Learning Based Auto Encoder for Video Anomaly Detection