Abstract: This paper presents an unsupervised video anomaly detection method based on Optical Flow decomposition and Spatio-Temporal feature learning (OFST). The method combines optical flow reconstruction with video frame prediction. The proposed OFST framework is composed of two modules: the Multi-Granularity Memory-augmented Autoencoder with Optical Flow Decomposition (MG-MemAE-OFD) and a Two-Stream Network based on Spatio-Temporal feature learning (TSN-ST). The MG-MemAE-OFD module consists of three functional blocks: optical flow decomposition, an autoencoder, and multi-granularity memory networks. The optical flow decomposition block extracts the main motion information of objects from the optical flow, and the multi-granularity memory network memorizes normal patterns and improves the quality of the reconstructions. To predict video frames, we introduce TSN-ST, which adopts parallel standard Transformer blocks and a temporal block to learn spatiotemporal features from video frames and optical flows. OFST combines these two modules so that the larger reconstruction error of abnormal samples further increases their prediction error, whereas normal samples obtain lower reconstruction and prediction errors. This enhances the anomaly detection capability of the method. We evaluated the proposed model on public datasets. In terms of the area under the curve (AUC), our model achieved 85.74% on the Ped1 dataset, 99.62% on the Ped2 dataset, 93.89% on the Avenue dataset, and 76.0% on the ShanghaiTech dataset. Our experimental results show an average improvement of 1.2% compared to the current state of the art.
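
The abstract describes scoring frames by both reconstruction error and prediction error and reporting frame-level AUC. The following is a minimal sketch of that idea, not the authors' implementation: the function names (`min_max`, `fuse_scores`, `auc`), the min-max normalization, and the equal fusion weights are all assumptions made for illustration, since the abstract does not specify how the two errors are combined.

```python
from typing import List, Sequence


def min_max(xs: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [0.0 for _ in xs]
    return [(x - lo) / (hi - lo) for x in xs]


def fuse_scores(
    recon_err: Sequence[float],
    pred_err: Sequence[float],
    w_recon: float = 0.5,  # assumed weight, not taken from the paper
    w_pred: float = 0.5,   # assumed weight, not taken from the paper
) -> List[float]:
    """Combine per-frame reconstruction and prediction errors into one score."""
    if len(recon_err) != len(pred_err):
        raise ValueError("error sequences must have the same length")
    r = min_max(recon_err)
    p = min_max(pred_err)
    return [w_recon * a + w_pred * b for a, b in zip(r, p)]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise ranking (1 = abnormal frame)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one abnormal frame")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy per-frame errors: the last two frames are labeled abnormal.
    recon = [0.1, 0.2, 0.9, 0.8]
    pred = [0.2, 0.1, 0.7, 1.0]
    labels = [0, 0, 1, 1]
    scores = fuse_scores(recon, pred)
    print(auc(scores, labels))
```

With this kind of fusion, a frame that is both poorly reconstructed and poorly predicted receives a high score, while a normal frame with low errors in both streams stays near zero, which is the separation the abstract attributes to combining the two modules.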

Enriching Optical Flow with Appearance Information for Action Recognition

Unsupervised Learning of Scene Flow Estimation Fusing with Local Rigidity

On the Integration of Optical Flow and Action Recognition

Ordered Pooling of Optical Flow Sequences for Action Recognition

Learning and Distillating the Internal Relationship of Motion Features in Action Recognition

Action recognition based on optical flow constrained auto-encoder

Unsupervised Motion Representation Enhanced Network for Action Recognition

Skin the sheep not only once: Reusing Various Depth Datasets to Drive the Learning of Optical Flow

OAS-Net: Occlusion Aware Sampling Network for Accurate Optical Flow

OIFlow: Occlusion-Inpainting Optical Flow Estimation by Unsupervised Learning

Dance with Flow: Two-in-One Stream Action Detection

Learning By Analogy: Reliable Supervision From Transformations For Unsupervised Optical Flow Estimation

Im2Flow: Motion Hallucination from Static Images for Action Recognition

Optical Flow as Spatial-Temporal Attention Learners

Video Anomaly Detection Via Successive Image Frame Prediction Leveraging Optical Flows

Flow Dynamics Correction for Action Recognition

An unsupervised video anomaly detection method via Optical Flow decomposition and Spatio-Temporal feature learning

RFRFlow: Recurrent feature refinement network for optical flow estimation

Learning Omnidirectional Flow in 360-degree Video via Siamese Representation

Learning to Estimate Optical Flow Using Dual-Frequency Paradigm

HMAFlow: Learning More Accurate Optical Flow via Hierarchical Motion Field Alignment