Finding Coherent Motions and Understanding Crowd Scenes: A Diffusion and Clustering-based Approach
Weiyao Lin, Yang Mi, You-Ping Zhong, Weiyue Wang
2015-01-01
Abstract: Coherent motions, which represent coherent movements of massive individual particles, are pervasive in natural and social scenarios. Examples include traffic flows and parades of people (cf. Figs. 1a and 2a). Since coherent motions can effectively decompose scenes into meaningful semantic parts and facilitate the analysis of complex crowd scenes, they are of increasing importance in crowd-scene understanding and activity recognition. In this paper, we address the problem of detecting coherent motions in crowd scenes, and subsequently using them to understand input scenes. More specifically, we focus on 1) constructing an accurate coherent motion field to find coherent motions, 2) finding stable semantic regions based on the detected coherent motions and using them to recognize pre-defined activities (i.e., activities with labeled training data) in a crowd scene, and 3) automatically mining recurrent activities in a crowd scene based on the detected coherent motions and semantic regions.

First, constructing an accurate coherent motion field is crucial in detecting reliable coherent motions. In Fig. 1, (b) is the input motion field and (c) is the coherent motion field constructed from (b) using the proposed approach. In (b), the motion vectors of particles at the beginning of the Marathon queue differ greatly from those at the end, and there are many inaccurate optical flow vectors. Due to such variations and input errors, it is difficult to achieve satisfactory coherent motion detection results directly from (b). However, by transforming (b) into a coherent motion field in which the coherent motions among particles are suitably highlighted, as in (c), coherent motion detection is greatly facilitated. Although many algorithms have been proposed for coherent motion detection [1, 9, 10, 12, 3], this problem has not yet been effectively addressed. We argue that a good coherent motion field should be able to 1) encode motion correlation among particles, such that particles with high correlations can be grouped into the same coherent region; and 2) maintain motion information of individual particles, such that activities in crowd scenes can be effectively parsed by the extracted coherent motion field. Based on these intuitions, we propose a thermal-diffusion-based approach, which can extract accurate coherent motion fields.

Figure 1. (a) Example frame of a Marathon video sequence; the red curve is the coherent motion region. (b) Input motion vector field of (a). (c) Coherent motion field from (b) using the proposed approach. (Best viewed in color.)

Figure 2. (a) Example time-varying coherent motions in a scene, where different coherent motions are circled by curves of different colors. (b) Constructed semantic regions for the scene in (a). (c) Recurrent activities for the scene in (a), where the arrows represent the major motion flows in each recurrent activity. (Best viewed in color.)

Second, constructing meaningful semantic regions to de-