The MediaMill TRECVID 2011 Semantic Video Search Engine.

Cees G. M. Snoek, Koen E. A. van de Sande, Xirong Li, Masoud Mazloom, Yu-Gang Jiang, Dennis C. Koelma, Arnold W. M. Smeulders
2011-01-01
Abstract: In this paper we describe our TRECVID 2011 video retrieval experiments. The MediaMill team participated in two tasks: semantic indexing and multimedia event detection. The starting point for the MediaMill detection approach is our top-performing bag-of-words system of TRECVID 2010, which uses multiple color SIFT descriptors, sparse codebooks with spatial pyramids, and kernel-based machine learning, all supported by GPU-optimized algorithms, approximated histogram intersection kernels, and multi-frame video processing. This year our experiments focus on 1) the soft assignment of descriptors with the use of difference coding, 2) the exploration of bag-of-words for event detection, and 3) the selection of informative concepts out of 1,346 concept detectors as a representation for event detection. Participation in the 2011 edition of the TRECVID benchmark has again been fruitful for the MediaMill team, resulting in the runner-up ranking for concept detection in the semantic indexing task.
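To make two of the ingredients named in the abstract concrete, the following is a minimal sketch of a bag-of-words histogram built with soft assignment and compared using a histogram intersection kernel. It is an illustration under stated assumptions, not the MediaMill implementation: it uses plain Gaussian soft assignment rather than the difference coding the abstract mentions, computes the exact kernel rather than an approximation, and omits spatial pyramids and GPU optimization. The function names, the `sigma` parameter, and the random stand-in descriptors are all hypothetical.

```python
import numpy as np


def soft_assign_histogram(descriptors, codebook, sigma=1.0):
    """Build a normalized bag-of-words histogram with Gaussian soft assignment.

    descriptors: (n, d) array of local descriptors (e.g. SIFT-like vectors).
    codebook:    (k, d) array of codewords.
    sigma:       hypothetical smoothing parameter of the Gaussian weighting.
    """
    # Squared Euclidean distance between every descriptor and every codeword.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    sq_dist = (diffs ** 2).sum(axis=2)

    # Gaussian weights; subtracting the per-row minimum avoids underflow
    # and does not change the weights after per-descriptor normalization.
    shifted = sq_dist - sq_dist.min(axis=1, keepdims=True)
    weights = np.exp(-shifted / (2.0 * sigma ** 2))
    weights /= weights.sum(axis=1, keepdims=True)

    # Each descriptor spreads a total weight of 1 over the codewords.
    hist = weights.sum(axis=0)
    return hist / hist.sum()


def histogram_intersection(h1, h2):
    """Exact histogram intersection kernel: sum of bin-wise minima."""
    return float(np.minimum(h1, h2).sum())


# Random stand-in data (assumption: 16-dimensional descriptors, 8 codewords).
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 16))
frame_a = rng.normal(size=(50, 16))
frame_b = rng.normal(size=(60, 16))

h_a = soft_assign_histogram(frame_a, codebook)
h_b = soft_assign_histogram(frame_b, codebook)
similarity = histogram_intersection(h_a, h_b)
```

Because both histograms are normalized to sum to one, the kernel value lies between 0 and 1, and a histogram compared with itself scores 1. In a kernel-based learning setup such as the one the abstract describes, values like `similarity` would fill the kernel matrix passed to the classifier.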