Motion Driven Approaches to Shot Boundary Detection, Low-Level Feature Extraction and BBC Rushes Characterization at TRECVID 2005

Chong-Wah Ngo, Zailiang Pan, Xiaoyong Wei, Xiao Wu, Hung-Khoon Tan, Wanlei Zhao
2005-01-01
Abstract: This paper describes our experimental results on shot boundary detection (SB), low-level feature extraction (LLF), and BBC Rushes exploration (BR) at TRECVID 2005. The approaches presented here are mostly based on our previous work (1, 2, 3), grounded in motion analysis with spatio-temporal slices, optical flow, and tensor representation. This year, our aim is to investigate the role of motion in fundamental tasks, including video structuring and characterization, for both edited (SB and LLF) and unedited (BR) videos. In SB (system C), we exploit the coherence and patterns of motion texture in spatio-temporal slices for boundary detection and classification. The cut and wipe detectors are based on our work in (1), which performs color-texture segmentation on three slices extracted from the videos to determine boundaries. The dissolve detector is based on our work in (3) and consists of two steps: multi-resolution cut detection and binary classification with Gabor features. We submit 10 runs, which differ in the size of the training data, flashlight detection capability, and the use of statistical features in addition to Gabor features for classification. Overall, the runs with additional features achieve better results. Increasing the training size sometimes degrades detection precision. In LLF (system A), a global 6-parameter affine model is estimated at each frame with LMedS and tensor representation for camera motion annotation. To characterize the changes of motion parameters over frames, we use the hysteresis thresholding and Kalman polyline estimation developed in (2) to segment and determine the types of motion in shots. We submit 7 runs for LLF, which differ in several empirical parameters. Overall, there is no significant difference in recall and precision among the runs. In BR (system A), we study two problems: how to structure and how to characterize BBC rushes.
We define three types of segments based on motion for structuring: intentional motion (IM), intermediate motion (IMM), and shaking artifacts (SA). Our aim is to partition-and-classify (or classify-and-partition) the videos into segments corresponding to their motion characteristics.
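
As an illustration of the hysteresis thresholding mentioned above for segmenting motion parameters over frames, the following is a minimal sketch and not the authors' implementation from (2). It assumes a single per-frame motion parameter (for example a pan component of the affine model), and the function name `hysteresis_segments`, the thresholds `low` and `high`, and the sample values are all hypothetical. The Kalman polyline estimation step is omitted.

```python
from typing import List, Sequence, Tuple


def hysteresis_segments(
    values: Sequence[float], low: float, high: float
) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges (end exclusive) flagged as motion.

    A run of frames is kept when every frame has magnitude >= low and at
    least one frame in the run has magnitude >= high. Both thresholds are
    illustrative assumptions, not values from the paper.
    """
    segments: List[Tuple[int, int]] = []
    start = None      # index where the current candidate run began
    strong = False    # whether the run contains a frame above `high`
    for i, v in enumerate(values):
        mag = abs(v)
        if mag >= low:
            if start is None:
                start = i
                strong = False
            if mag >= high:
                strong = True
        else:
            # The run ends here; keep it only if it had strong evidence.
            if start is not None and strong:
                segments.append((start, i))
            start = None
    # Close a run that extends to the last frame.
    if start is not None and strong:
        segments.append((start, len(values)))
    return segments


# Hypothetical per-frame pan magnitudes for one shot.
pan = [0.0, 0.2, 0.6, 1.4, 1.1, 0.7, 0.1, 0.6, 0.7, 0.2, 0.0]
print(hysteresis_segments(pan, low=0.5, high=1.0))
```

In this sketch the weak run at frames 7 and 8 is discarded because it never reaches the high threshold, while the run starting at frame 2 is kept in full, including its weaker leading and trailing frames. That is the usual motivation for using two thresholds instead of one.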