Collaborative Normality Learning Framework for Weakly Supervised Video Anomaly Detection

Yang Liu, Jing Liu, Mengyang Zhao, Shuang Li, Liang Song
DOI: https://doi.org/10.1109/tcsii.2022.3161061
2022-01-01
Abstract: Video anomaly detection (VAD) under weak supervision aims to temporally locate abnormal clips using only easy-to-obtain video-level labels. In this brief, we bring the core idea of unsupervised VAD into weakly supervised VAD and propose a collaborative normality learning framework to obtain more discriminative deep representations. Specifically, a deep auto-encoder is first trained in an unsupervised manner to learn the prototypical spatial-temporal patterns of normal videos. Then, both normal and abnormal videos are used to train a regression module, where the objective is to make the average score of the abnormal videos higher than the maximum score of the normal videos. Finally, the clips in abnormal videos whose anomaly score is lower than the average are regarded as normal and used to fine-tune the trained auto-encoder. The unsupervised auto-encoder thus collaborates with the weakly supervised regression model to extract prototypical features of normal clips, making the learned features of normal and abnormal events more distinguishable. Experimental results on three benchmark datasets show that the proposed framework achieves performance comparable to state-of-the-art methods. Additionally, ablation studies demonstrate the validity of collaborative normality learning.
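The two scoring rules described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the hinge form of the loss, the `margin` parameter, the function names, and the use of per-video clip scores are all assumptions made for illustration, since the abstract states only the ordering constraint (average abnormal score above maximum normal score) and the below-average selection rule.

```python
from statistics import mean


def ranking_loss(abnormal_scores, normal_scores, margin=1.0):
    """Hypothetical hinge-style objective (assumed form, not from the paper).

    The loss is zero once the average clip score of the abnormal video
    exceeds the maximum clip score of the normal video by at least `margin`.
    """
    return max(0.0, margin - mean(abnormal_scores) + max(normal_scores))


def pseudo_normal_indices(abnormal_scores):
    """Indices of clips in an abnormal video that score below its average.

    Per the abstract, such clips are treated as normal and reused to
    fine-tune the auto-encoder.
    """
    avg = mean(abnormal_scores)
    return [i for i, score in enumerate(abnormal_scores) if score < avg]


# Toy clip scores (made up for illustration only).
abnormal = [0.1, 0.2, 0.9, 0.8]
normal = [0.1, 0.3, 0.2]

loss = ranking_loss(abnormal, normal, margin=1.0)
selected = pseudo_normal_indices(abnormal)
```

In this toy example the first two clips of the abnormal video fall below its average score, so they would be the ones passed back to the auto-encoder as pseudo-normal samples.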