Abstract: The growing rate of public space CCTV installations has generated a need for automated methods for exploiting video surveillance data, including scene understanding, query, behaviour annotation and summarization. For this reason, extensive research has been performed on surveillance scene understanding and analysis. However, most studies have considered single scenes or groups of adjacent scenes. The semantic similarity between different but related scenes (e.g., many different traffic scenes of similar layout) is not generally exploited to improve automated surveillance tasks and reduce manual effort. Exploiting commonality, and sharing supervised annotations, between different scenes is challenging, however, for two reasons. First, some scenes are totally unrelated, so any information sharing between them would be detrimental, while others share only a subset of common activities, so information sharing is useful only if it is selective. Second, semantically similar activities that should be modelled together and shared across scenes may have quite different pixel-level appearance in each scene. To address these issues, we develop a new framework for distributed multiple-scene global understanding that clusters surveillance scenes by their ability to explain each other's behaviours, and further discovers which subset of activities is shared versus scene-specific within each cluster. We show how to use this structured representation of multiple scenes to improve common surveillance tasks, including scene activity understanding, cross-scene query-by-example, behaviour classification with reduced supervised labelling requirements, and video summarization. In each case we demonstrate how our multi-scene model improves on a collection of standard single-scene models and on a flat model of all scenes.

Learning Semantic Scene Models by Object Classification and Trajectory Clustering

From Time to Space: Automatic Annotation of Unmarked Traffic Scene Based on Trajectory Data

Vision-Based Moving Objects Detection with Background Modeling

Mining Semantic Context Information for Intelligent Video Surveillance of Traffic Scenes

Research on Recognition and Classification of Moving Objects in Mixed Traffic Based on Video Detection

Semantic scene upgrades for trajectory prediction

Spatiotemporal Analysis of Static and Dynamic Traffic Elements from Road Scenes

Learning a Scene Contextual Model for Tracking and Abnormality Detection

M4L: Maximum Margin Multi-instance Multi-cluster Learning for Scene Modeling

Semantic-based surveillance video retrieval

SemanticFormer: Holistic and Semantic Traffic Scene Representation for Trajectory Prediction using Knowledge Graphs

Discovery of Shared Semantic Spaces for Multi-Scene Video Query and Summarization

Video Sensor-Based Complex Scene Analysis with Granger Causality

An Approach for Construct Semantic Map with Scene Classification and Object Semantic Segmentation

High Efficient Moving Object Extraction and Classification in Traffic Video Surveillance

Online Sequence Clustering Algorithm for Video Trajectory Analysis

SOM Based Activity Learning for Visual Surveillance System

Trajectory-Based Scene Understanding Using Dirichlet Process Mixture Model

An Online Approach: Learning-Semantic-Scene-By-Tracking And Tracking-By-Learning-Semantic-Scene

A Prototype Learning Framework Using EMD: Application to Complex Scenes Analysis

A General Framework of Learning Multi-Vehicle Interaction Patterns from Video