Deep learning based anomaly detection in real-time video

DOI: https://doi.org/10.1007/s11042-024-19116-9
IF: 2.577
2024-05-11
Multimedia Tools and Applications
Abstract: Many security cameras have been put up in places like airports, roads, and banks for the safety of these public places. These cameras make a lot of video data, and most security camera recordings are only ever seen when something strange happens. This means that monitoring has to be done by people, which is time-consuming and often wrong, so automatic ways of monitoring have to be used. In this paper, we propose a system that automatically detects irregular events in videos based on the integration of Inflated 3D Convolution Network (I3D-ResNet50) and deep Multiple Instance Learning (MIL). This system considers both regular and unusual videos as negative and positive packets, respectively. Each video snippet is an instance of that packet. An anomaly score is generated for each video snippet using a fully connected Neural Network (NN). After processing videos, we used an I3D-ResNet50 to extract features after applying 10-crop augmentations to the UCF-Crime dataset that contains 130 GB of videos with 13 abnormal events such as fighting, stealing, abuse, etc., as well as normal events. Our experimental results show an AUC of 82.85% after only 10,000 iterations, which compares favorably with other approaches. This means that our model is better at spotting anomalies in real-time videos.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
What problem does this paper attempt to address?
The paper addresses the automatic detection of abnormal events in real-time video. Specifically, the authors propose a deep-learning method that identifies abnormal behavior in surveillance video by integrating an Inflated 3D Convolutional Network (I3D-ResNet50) with deep Multiple Instance Learning (MIL).

### Problem Background

Security cameras installed in public places such as airports, roads, and banks generate large volumes of video. Most recordings are only viewed after an abnormal event has occurred, and manual monitoring is time-consuming and error-prone. Automated monitoring systems are therefore needed to improve efficiency and accuracy.

### Research Objectives

The goal of the paper is to design a system that automatically detects abnormal events in videos, reducing the need for manual monitoring and improving the accuracy and real-time performance of detection. To this end, the authors propose the following steps:

1. **Video Pre-processing**: Divide each training video into a fixed number of temporal snippets, and treat the video as a positive packet if it contains an abnormal event or as a negative packet if it does not.
2. **Feature Extraction and Anomaly Score Generation**: Use the pre-trained I3D-ResNet50 model to extract spatio-temporal features from the video snippets, and generate an anomaly score for each snippet with a fully connected neural network (FCNN).
3. **Multiple Instance Learning (MIL)**: Train the network with deep MIL and a ranking loss function so that it separates abnormal from normal events.

### Key Challenges

1. **Large Amount of Data**: Abnormal events usually account for only a small part of a video, so a large amount of irrelevant data has to be processed.
2. **Diversity of Abnormal Events**: Different types of abnormal events vary greatly, which makes it difficult to design general features.
3. **Time Information Management**: Video processing has to take temporal information into account, not only spatial information.

### Experimental Results

The method achieves an AUC of 82.85% on the UCF-Crime dataset, which the authors report as better than the approaches they compare against. The authors also state that the system achieves good real-time detection performance in practical applications.

### Conclusion

The paper proposes a real-time video anomaly detection system based on I3D-ResNet50 and deep MIL, which addresses shortcomings of current video surveillance and improves the accuracy and efficiency of abnormal event detection.
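
To make the steps listed under Research Objectives more concrete, the following is a minimal NumPy sketch of two of them: pooling features into a fixed number of snippets, and a hinge-style ranking loss between one positive and one negative packet. It is an illustration under stated assumptions, not the authors' implementation. The summary does not give the exact form of the ranking loss, so the margin-based hinge on the top-scoring snippet of each packet is an assumption, and the names `split_into_snippets` and `mil_ranking_loss` are hypothetical. The per-snippet anomaly scores, which in the paper come from a fully connected network on I3D-ResNet50 features, are passed in here as plain arrays.

```python
import numpy as np


def split_into_snippets(frame_features: np.ndarray, num_snippets: int) -> np.ndarray:
    """Average per-frame features into a fixed number of temporal snippets.

    frame_features has shape (num_frames, feature_dim). The result has
    shape (num_snippets, feature_dim). Hypothetical helper for illustration.
    """
    if frame_features.shape[0] < num_snippets:
        raise ValueError("need at least one frame per snippet")
    # array_split tolerates frame counts that are not a multiple of num_snippets.
    chunks = np.array_split(frame_features, num_snippets, axis=0)
    return np.stack([chunk.mean(axis=0) for chunk in chunks])


def mil_ranking_loss(pos_scores, neg_scores, margin: float = 1.0) -> float:
    """Hinge ranking loss between a positive and a negative packet.

    Assumed form: the highest-scoring snippet of the anomalous video should
    outscore the highest-scoring snippet of the normal video by `margin`.
    Only video-level labels are needed, which is the point of MIL.
    """
    pos_max = float(np.max(pos_scores))
    neg_max = float(np.max(neg_scores))
    return max(0.0, margin - pos_max + neg_max)


# Example: 10 frames of 2-dimensional features pooled into 5 snippets.
features = np.arange(20, dtype=float).reshape(10, 2)
snippets = split_into_snippets(features, num_snippets=5)

# Example scores for one anomalous and one normal video.
loss = mil_ranking_loss([0.1, 0.9, 0.2], [0.05, 0.3])
```

Because the loss only looks at the maximum score in each packet, the gradient in a trainable version would flow through the single most suspicious snippet of each video, which is how snippet-level scores can be learned from video-level labels alone.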