Normality learning reinforcement for anomaly detection in surveillance videos

Kai Cheng, Xinhua Zeng, Yang Liu, Yaning Pan, Xinzhe Li
DOI: https://doi.org/10.1016/j.knosys.2024.111942
IF: 8.139
2024-05-23
Knowledge-Based Systems
Abstract: Video Anomaly Detection (VAD) is a key technology that enables automatic anomaly detection in surveillance video systems. Because video sequences are high-dimensional and contain fine-grained spatial details, VAD requires jointly learning spatial and temporal contextual information. Owing to a restricted understanding of spatial details, previous unsupervised VAD works have yet to tap the full potential of prediction-based methods for capturing spatial–temporal correlations. Inspired by the lateral geniculate nucleus in visual processing, we propose a Normality Learning Reinforcement framework for VAD (NLR-VAD) that enables learning more sophisticated and efficient spatial–temporal interactions. Specifically, NLR-VAD introduces a Normality Learning Reinforcement Unit (NLRU) rooted in diffusion models. NLRU filters out irrelevant information and amplifies the retrieval of spatial normality knowledge, particularly in high-resolution details. NLRU collaborates with a Memory-Augmented Auto-Encoder (MAAE) to enhance spatial–temporal normality through fine-grained extraction. Moreover, a cross-NLRU fusion unit is proposed for information integration, acting as a bridge between NLRU and MAAE. Quantitative results on three benchmarks show that NLR-VAD performs competitively with previous methods. Extended analysis and visualization demonstrate the effectiveness of the proposed units.
computer science, artificial intelligence