Attention-Driven Loss for Anomaly Detection in Video Surveillance

Joey Tianyi Zhou, Le Zhang, Zhiwen Fang, Jiawei Du, Xi Peng, Yang Xiao

DOI: https://doi.org/10.1109/tcsvt.2019.2962229

IF: 5.859

2020-12-01

IEEE Transactions on Circuits and Systems for Video Technology

Abstract: Recent video anomaly detection methods focus on reconstructing or predicting frames. In this setting, the long-standing inter-class data-imbalance problem takes the form of an imbalance between foreground objects and the stationary background, which existing solutions have investigated little. Naively optimizing the reconstruction loss biases optimization towards reconstructing the background rather than the foreground objects of interest. To address this, we propose a simple yet effective solution, termed attention-driven loss, that alleviates the foreground-background imbalance problem in anomaly detection. Specifically, we compute a single mask map that summarizes the frame evolution of moving foreground regions and suppresses the background in the training video clips. We then construct an attention map by combining the mask map with the background, giving different weights to the foreground and background regions. The proposed attention-driven loss is independent of the backbone network and can be easily added to most existing anomaly detection models. Augmented with attention-driven loss, the model achieves an AUC of 86.0% on Avenue, 83.9% on Ped1, and 96% on Ped2. Extensive experimental results and ablation studies further validate the effectiveness of our model.
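The weighting idea in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulas, so everything here is an assumption: the mask is built by simple frame differencing, and the function names (`mask_map`, `attention_map`, `attention_driven_loss`), the threshold, and the foreground/background weights are hypothetical, illustrative choices rather than the authors' implementation.

```python
import numpy as np


def mask_map(clip, threshold=0.1):
    """Summarize motion in a clip as one binary mask (assumed: frame differencing).

    clip: array of shape (T, H, W), grayscale frames with values in [0, 1].
    Returns an (H, W) float32 mask: 1 where any consecutive-frame difference
    exceeds `threshold`, 0 elsewhere (stationary background).
    """
    diffs = np.abs(np.diff(clip, axis=0))          # shape (T-1, H, W)
    return (diffs.max(axis=0) > threshold).astype(np.float32)


def attention_map(mask, fg_weight=5.0, bg_weight=1.0):
    """Give moving regions a larger weight than the background.

    The weights are illustrative values, not taken from the paper.
    """
    return fg_weight * mask + bg_weight * (1.0 - mask)


def attention_driven_loss(pred, target, attention):
    """Attention-weighted mean squared error between two (H, W) frames."""
    weighted = attention * (pred - target) ** 2
    return float(weighted.sum() / attention.sum())


def make_toy_clip():
    """Synthetic clip: a 2x2 bright square moving right over a black background."""
    clip = np.zeros((4, 8, 8), dtype=np.float32)
    for t in range(4):
        clip[t, 3:5, t:t + 2] = 1.0
    return clip


if __name__ == "__main__":
    clip = make_toy_clip()
    attention = attention_map(mask_map(clip))
    target = clip[-1]
    # A "lazy" prediction that reproduces the background but misses the object.
    pred = np.zeros_like(target)
    plain_mse = float(((pred - target) ** 2).mean())
    weighted = attention_driven_loss(pred, target, attention)
    print(f"plain MSE: {plain_mse:.4f}, attention-weighted: {weighted:.4f}")
```

In this toy setting, a prediction that reproduces only the background is penalized more heavily by the weighted loss than by the plain mean squared error, which is the bias correction the abstract describes. Because the weighting touches only the loss, the same function could in principle wrap the output of any reconstruction or prediction backbone.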

engineering, electrical & electronic

What problem does this paper attempt to address?

This paper addresses anomaly detection in video surveillance, and in particular the imbalance between foreground and background in video data. Existing video anomaly detection methods usually reconstruct or predict video frames, but they often overlook the imbalance between foreground objects (such as moving people or vehicles) and the static background. Because the background typically occupies most of each frame, optimization is biased towards reconstructing the background rather than the foreground objects, which hurts anomaly detection. To solve this problem, the authors propose a simple and effective method called attention-driven loss. Specifically, they compute a single mask map that summarizes the frame evolution of the moving foreground regions and suppresses the background in the training video clips. They then combine the mask map with the background to construct an attention map that assigns different weights to the foreground and background regions. As a result, the model pays more attention to foreground objects during optimization, which improves detection accuracy. The method is independent of the backbone network and can be easily integrated into most existing anomaly detection models. With attention-driven loss, the model reaches an AUC of 86.0% on Avenue, 83.9% on Ped1, and 96% on Ped2. Extensive experiments and ablation studies further support the effectiveness of the method.
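For readers unfamiliar with the AUC figures quoted above, the following is a minimal sketch of how such a score can be computed from per-frame anomaly scores. It assumes a frame-level evaluation with binary ground-truth labels, which this page does not state explicitly. The helper names `normalize_scores` and `roc_auc` are hypothetical, and the sample numbers are synthetic, not results from the paper.

```python
import numpy as np


def normalize_scores(errors):
    """Min-max normalize per-frame errors to [0, 1] (assumed post-processing)."""
    errors = np.asarray(errors, dtype=np.float64)
    span = errors.max() - errors.min()
    if span == 0:
        return np.zeros_like(errors)
    return (errors - errors.min()) / span


def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking definition.

    Equals the probability that a randomly chosen anomalous frame receives a
    higher score than a randomly chosen normal frame (ties count as 0.5).
    """
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=np.float64)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one normal and one anomalous frame")
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))


if __name__ == "__main__":
    # Synthetic per-frame reconstruction errors and labels (1 = anomalous).
    errors = [0.1, 0.4, 0.35, 0.8]
    labels = [0, 0, 1, 1]
    print(roc_auc(labels, normalize_scores(errors)))
```

The pairwise formulation is quadratic in the number of frames, so it suits short illustrations; for full datasets a sorted, rank-based implementation would be the more practical choice.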


Contrastive Attention for Video Anomaly Detection

Video Anomaly Detection Based on Attention Mechanism

Learning Attention Augmented Spatial-temporal Normality for Video Anomaly Detection

Object-Guided and Motion-Refined Attention Network for Video Anomaly Detection

Influence-aware Attention Networks for Anomaly Detection in Surveillance Videos

Dual contrast discriminator with sharing attention for video anomaly detection

Attention-based anomaly detection in multi-view surveillance videos

Video Anomaly Detection Based on Spatio-Temporal Relationships among Objects

Attention-based residual autoencoder for video anomaly detection

A Lightweight Video Anomaly Detection Model with Weak Supervision and Adaptive Instance Selection

Video anomaly detection based on a multi-layer reconstruction autoencoder with a variance attention strategy

Anomalies cannot materialize or vanish out of thin air: A hierarchical multiple instance learning with position-scale awareness for video anomaly detection

Learning Task-Specific Representation for Video Anomaly Detection with Spatial-Temporal Attention

Robust Unsupervised Video Anomaly Detection by Multipath Frame Prediction

Pedestrian Spatio-Temporal Information Fusion For Video Anomaly Detection

Robust Unsupervised Video Anomaly Detection by Multi-Path Frame Prediction

Weakly Supervised Video Anomaly Detection via Center-guided Discriminative Learning

Synthetic Pseudo Anomalies for Unsupervised Video Anomaly Detection: A Simple yet Efficient Framework based on Masked Autoencoder

Enhancement of Video Anomaly Detection Performance Using Transfer Learning and Fine-Tuning

Video Anomaly Detection Based on Global–Local Convolutional Autoencoder