Abstract: Many security cameras have been installed in public places such as airports, roads, and banks to keep them safe. These cameras generate large volumes of video, yet most recordings are viewed only when something unusual happens. Monitoring therefore falls to human operators, which is time-consuming and error-prone, so automated monitoring is needed. In this paper, we propose a system that automatically detects irregular events in videos by integrating an Inflated 3D Convolution Network (I3D-ResNet50) with deep Multiple Instance Learning (MIL). The system treats normal and anomalous videos as negative and positive bags (packets), respectively, and each video snippet is an instance of its bag. A fully connected Neural Network (NN) generates an anomaly score for each video snippet. After preprocessing the videos and applying 10-crop augmentation, we used I3D-ResNet50 to extract features from the UCF-Crime dataset, which contains 130 GB of videos covering 13 abnormal events such as fighting, stealing, and abuse, as well as normal events. Our experimental results show an AUC of 82.85% after only 10,000 iterations, which compares favorably with other approaches. This indicates that our model is better at spotting anomalies in real-time videos.
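As a rough illustration of the bag-and-instance setup described in the abstract, the sketch below turns one video's clip-level feature vectors into a fixed number of snippet features by averaging consecutive clips. The function name `split_into_snippets`, the `num_snippets` parameter, and the averaging rule are assumptions made for illustration; the abstract does not say how many snippets are used or how features are pooled.

```python
from typing import List


def split_into_snippets(
    features: List[List[float]], num_snippets: int
) -> List[List[float]]:
    """Pool clip-level features into `num_snippets` instance features.

    `features` holds one feature vector per clip of a single video
    (for example, vectors produced by a pretrained 3D CNN). The result
    is the "bag" for that video: one averaged vector per snippet.
    Hypothetical helper, not the paper's code.
    """
    if num_snippets <= 0:
        raise ValueError("num_snippets must be positive")
    if not features:
        raise ValueError("features must contain at least one vector")

    n = len(features)
    dim = len(features[0])
    bag = []
    for i in range(num_snippets):
        # Evenly spaced boundaries; every snippet gets at least one clip,
        # so short videos reuse clips instead of producing empty snippets.
        start = (i * n) // num_snippets
        end = max(start + 1, ((i + 1) * n) // num_snippets)
        chunk = features[start:end]
        # Element-wise mean of the clip vectors in this snippet.
        bag.append([sum(vec[d] for vec in chunk) / len(chunk) for d in range(dim)])
    return bag


if __name__ == "__main__":
    clips = [[0.0, 0.0], [2.0, 4.0], [4.0, 8.0], [6.0, 12.0]]
    print(split_into_snippets(clips, num_snippets=2))
```

A bag built this way inherits the video-level label: all snippets of a normal video are negative instances, while an anomalous video is only known to contain at least one positive instance.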

What problem does this paper attempt to address?

The problem this paper attempts to solve is the automatic detection of abnormal events in real-time video. Specifically, the authors propose a deep-learning-based method that identifies abnormal behavior in surveillance videos by integrating an Inflated 3D Convolutional Network (I3D-ResNet50) with deep Multiple-Instance Learning (MIL).

### Problem Background

Large numbers of security cameras are installed in public places such as airports, roads, and banks, and they generate a large amount of video data. Most of these recordings are viewed only when an abnormal event occurs, so monitoring depends on human operators, which is time-consuming and error-prone. Automated monitoring systems are therefore needed to improve efficiency and accuracy.

### Research Objectives

The goal of the paper is to design a system that automatically detects abnormal events in videos, reducing the need for manual monitoring and improving both the accuracy and the real-time performance of detection. To this end, the authors propose the following steps:

1. **Video Pre-processing**: Divide each training video into a fixed number of temporal snippets, and label each video as a positive or negative bag (packet) according to whether it contains an abnormal event.
2. **Feature Extraction and Anomaly Score Generation**: Use the pre-trained I3D-ResNet50 model to extract spatio-temporal features from the video snippets, and generate an anomaly score for each snippet with a fully connected neural network (FCNN).
3. **Multiple-Instance Learning (MIL)**: Train the network with deep MIL and a ranking loss function so that it distinguishes abnormal events from normal ones.

### Key Challenges

1. **Large Amount of Data**: Abnormal events usually account for only a small part of a video, so a large amount of irrelevant data must be processed.
2. **Diversity of Abnormal Events**: Different types of abnormal events vary greatly, which makes it difficult to design general features.
3. **Temporal Information Management**: Video processing must take into account temporal information as well as spatial information.

### Experimental Results

The experimental results show that the method achieves an AUC of 82.85% on the UCF-Crime dataset, outperforming other existing methods. The authors also conclude that the system performs well at real-time detection in practical applications.

### Conclusion

This paper proposes a real-time video anomaly detection system based on I3D-ResNet50 and deep MIL. It addresses the cost and unreliability of manual video surveillance and improves the accuracy and efficiency of abnormal event detection.
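The MIL ranking step can be sketched as follows. This is a minimal illustration, assuming the hinge-style ranking objective commonly used in weakly supervised video anomaly detection, where the highest-scoring snippet of an anomalous video should outrank the highest-scoring snippet of a normal video by a margin. The summary does not state the exact loss, so the function name `mil_ranking_loss`, the `margin`, and the optional smoothness and sparsity terms (disabled by default) are assumptions rather than the paper's confirmed formulation.

```python
from typing import Sequence


def mil_ranking_loss(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
    margin: float = 1.0,
    smooth_weight: float = 0.0,
    sparsity_weight: float = 0.0,
) -> float:
    """Hinge ranking loss between one positive bag and one negative bag.

    `pos_scores` are snippet anomaly scores from an anomalous video,
    `neg_scores` are snippet scores from a normal video, both in
    temporal order. Hypothetical sketch, not the paper's code.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both bags must contain at least one score")

    # Only the top-scoring instance of each bag is compared, because the
    # video-level label says nothing about which snippet is anomalous.
    hinge = max(0.0, margin - max(pos_scores) + max(neg_scores))

    # Optional regularizers on the positive bag (assumed, off by default):
    # neighbouring snippets should score similarly, and few should be high.
    smooth = sum((a - b) ** 2 for a, b in zip(pos_scores, pos_scores[1:]))
    sparsity = sum(pos_scores)

    return hinge + smooth_weight * smooth + sparsity_weight * sparsity


if __name__ == "__main__":
    anomalous = [0.1, 0.9, 0.2]
    normal = [0.3, 0.2, 0.1]
    print(mil_ranking_loss(anomalous, normal))
```

Comparing only the maximum scores is what lets training proceed with video-level labels alone: no snippet inside an anomalous video needs to be annotated.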

Deep learning based anomaly detection in real-time video
