OSIN: Object-Centric Scene Inference Network for Unsupervised Video Anomaly Detection

Yang Liu, Zhengliang Guo, Jing Liu, Chengfang Li, Liang Song
DOI: https://doi.org/10.1109/lsp.2023.3263792
2023-01-01
IEEE Signal Processing Letters
Abstract: Video Anomaly Detection (VAD) is an essential yet challenging task in the signal processing community, which aims to understand the spatial and temporal contextual interactions between objects and surrounding scenes to detect unexpected events in surveillance videos. However, existing unsupervised methods either use a single network to learn global prototype patterns without making a unique distinction between foreground objects and background scenes or try to strip objects from frames, ignoring that the essence of anomalies lies in unusual object-scene interactions. To this end, this letter proposes an Object-centric Scene Inference Network (OSIN) that uses a well-designed three-stream structure to learn both global scene normality and local object-specific normal patterns as well as explore the object-scene interactions using scene memory networks. Experimental results on three benchmark datasets demonstrate the effectiveness of the proposed OSIN model, which achieves frame-level AUCs of 91.7%, 79.6%, and 98.3% on the CUHK Avenue, ShanghaiTech, and UCSD Ped2 datasets, respectively.