Spatial-Temporal Cascade Autoencoder for Video Anomaly Detection in Crowded Scenes.

Nanjun Li, Faliang Chang, Chunsheng Liu
DOI: https://doi.org/10.1109/tmm.2020.2984093
IF: 7.3
2021-01-01
IEEE Transactions on Multimedia
Abstract: Time-efficient anomaly detection and localization in video surveillance remains challenging due to the complexity of “anomaly”. In this paper, we propose a cuboid-patch-based method characterized by a cascade of classifiers, called a spatial-temporal cascade autoencoder (ST-CaAE), which makes full use of both spatial and temporal cues from video data. The ST-CaAE has two main stages, defined by two proposed neural networks: a spatial-temporal adversarial autoencoder (ST-AAE) and a spatial-temporal convolutional autoencoder (ST-CAE). First, the ST-AAE is used to preliminarily identify anomalous video cuboids and exclude normal cuboids. The key idea underlying the ST-AAE is to obtain a Gaussian model that fits the distribution of the regular data. Then, in the second stage, the ST-CAE classifies the specific abnormal patches in each anomalous cuboid with a reconstruction-error-based strategy that takes advantage of the CAE and skip connections. A two-stream framework fuses appearance and motion cues to achieve more complete detection results, taking gradient cuboids and optical flow cuboids as the inputs of the respective streams. The proposed ST-CaAE is evaluated on three public datasets. The experimental results verify that our framework outperforms other state-of-the-art works.
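The two-stage cascade described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the latent codes and reconstructions are assumed to come from already trained networks (standing in for the ST-AAE encoder and the ST-CAE), and the function names, the Mahalanobis-distance score, the thresholds, and the patch size are all hypothetical choices made for illustration only.

```python
import numpy as np

def fit_gaussian(latents):
    """Fit a Gaussian (mean, inverse covariance) to latent codes of normal cuboids."""
    mean = latents.mean(axis=0)
    # Small ridge term keeps the covariance invertible (an assumption of this sketch).
    cov = np.cov(latents, rowvar=False) + 1e-6 * np.eye(latents.shape[1])
    return mean, np.linalg.inv(cov)

def mahalanobis_sq(latents, mean, cov_inv):
    """Squared Mahalanobis distance of each latent code to the fitted Gaussian."""
    diff = latents - mean
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

def patch_errors(cuboid, reconstruction, patch):
    """Mean squared reconstruction error per spatial patch of a (T, H, W) cuboid."""
    err = ((cuboid - reconstruction) ** 2).mean(axis=0)  # average over time -> (H, W)
    h, w = err.shape
    err = err[: h - h % patch, : w - w % patch]  # drop any ragged border
    return err.reshape(h // patch, patch, w // patch, patch).mean(axis=(1, 3))

def cascade_detect(latents, cuboids, reconstructions, mean, cov_inv,
                   stage1_thresh, stage2_thresh, patch=4):
    """Stage 1 filters cuboids by Gaussian fit; stage 2 localizes abnormal patches."""
    scores = mahalanobis_sq(latents, mean, cov_inv)
    results = {}
    # Only cuboids that stage 1 flags as anomalous reach the costlier stage 2.
    for i in np.flatnonzero(scores > stage1_thresh):
        errors = patch_errors(cuboids[i], reconstructions[i], patch)
        results[int(i)] = errors > stage2_thresh  # boolean mask of abnormal patches
    return results
```

The point of the cascade structure is that the cheap first-stage test discards most normal cuboids, so the per-patch reconstruction error is computed only for the few cuboids that remain. In the two-stream setting, one such cascade would run on gradient cuboids and another on optical flow cuboids, with the resulting masks fused afterwards; the fusion rule is not specified in the abstract and is left out of this sketch.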