Learning Appearance-Motion Synergy Via Memory-Guided Event Prediction for Video Anomaly Detection

Chongye Guo, Hongbo Wang, Yingjie Xia, Guorui Feng
DOI: https://doi.org/10.1109/tcsvt.2023.3297114
IF: 5.859
2024-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Classic unsupervised anomaly detection learns normative patterns from normal behavior and assumes that unforeseen anomalous behavior will result in significant prediction deviations. However, in some situations it is difficult to detect ambiguous behavior, in which the abnormality is not readily apparent. Existing anomaly detection approaches perform poorly on ambiguous behavior because their capacity to represent normative patterns is limited, which results in a narrow normality gap. We observe that the ambiguity of behavior stems from the contradiction between the properties of the appearance and motion modalities. In this paper, we propose a novel memory-guided autoencoder, named the appearance-motion synergy autoencoder, that detects anomalous behavior by event prediction. To address the above challenge, we leverage the synergy of the normative appearance and motion modalities to strengthen the representation of normative patterns and improve the detection of ambiguous behavior. Specifically, we design memory networks with dynamic fusion mechanisms to integrate correlated appearance-motion information and to remember normal patterns. A consistency measurement unit optimizes the consistency of normative appearance-motion features via a joint distribution measurement pool. The resulting larger normality gap on ambiguous behavior enhances the anomaly detection capability of our approach. Extensive experiments demonstrate the superiority of our approach in detecting anomalous behavior.
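The prediction-deviation assumption in the first sentence of the abstract can be illustrated with a small sketch. It is a generic illustration, not the scoring used in the paper: the abstract does not specify a score, so the use of PSNR, the per-video min-max normalisation, and the names `psnr` and `anomaly_scores` are all assumptions made here. Frames are represented as flat lists of pixel values in [0, 1].

```python
import math


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio between a predicted and an actual frame.

    Both frames are flat lists of pixel values in [0, max_val].
    (Hypothetical helper; PSNR is an assumed choice of deviation measure.)
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("frames must be non-empty and of equal length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    mse = max(mse, 1e-10)  # avoid division by zero for a perfect prediction
    return 10.0 * math.log10(max_val ** 2 / mse)


def anomaly_scores(psnrs):
    """Map per-frame PSNR values of one video to anomaly scores in [0, 1].

    A low PSNR (large prediction deviation) yields a score close to 1.
    """
    lo, hi = min(psnrs), max(psnrs)
    if hi == lo:
        return [0.0 for _ in psnrs]
    return [1.0 - (v - lo) / (hi - lo) for v in psnrs]


# Toy example: one actual frame and three predictions of decreasing quality.
target = [0.5, 0.5, 0.5, 0.5]
predictions = [
    [0.5, 0.5, 0.5, 0.51],  # near-perfect prediction (normal event)
    [0.6, 0.4, 0.6, 0.4],   # moderate deviation
    [0.9, 0.1, 0.9, 0.1],   # large deviation (likely anomalous)
]
scores = anomaly_scores([psnr(p, target) for p in predictions])
```

In this toy setting the scores increase with the prediction error. A narrow normality gap, in the sense used in the abstract, corresponds to normal and ambiguous frames receiving scores that are too close together to separate with a threshold.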