Multi-information fusion for uncertain semantic representations of videos.

Bo Lu, Guoren Wang, Xiaofeng Gong
DOI: https://doi.org/10.1145/1871437.1871684
2010-01-01
Abstract: Concept-Based Semantic Video Retrieval (CBSVR) usually uses semantic representations of videos to handle users' retrieval requests. The accuracy of semantic video retrieval depends on the results of concept detectors, but these detection results are usually imprecise and uncertain. In this paper, we propose a multi-information fusion (MIF) approach dedicated to solving the problem of uncertain semantic representations of videos in order to improve retrieval accuracy. The approach is based on a novel two-phase framework consisting of an inferring phase and a fusing phase. In the inferring phase, the concepts most relevant to the user's query are chosen by exploring both contextual correlation among concepts and temporal correlation among shots. In the fusing phase, the inferred probabilities of the related concepts are fused with the detection results by minimizing a potential function to refine the detector predictions. Experiments on the widely used TRECVID datasets demonstrate that our approach can effectively improve the accuracy of semantic concept detection.
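The two-phase idea described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's inference rule or potential function, so everything below is an assumption made for illustration: the inferring phase is approximated by a correlation-weighted average over related concepts in neighboring shots, and the fusing phase by a quadratic potential with a closed-form minimizer. The names infer_probability, fuse, window, and lam are hypothetical.

```python
# Illustrative sketch only. The weighted-average inference and the quadratic
# potential are assumptions, not the formulation used in the paper.


def infer_probability(scores, correlations, shot, target, window=1):
    """Inferring phase (assumed form).

    Estimate the target concept's probability for one shot from correlated
    concepts (contextual correlation) in the shot and its neighbors
    (temporal correlation).

    scores[t][c]     detector score of concept c in shot t, in [0, 1]
    correlations[c]  hypothetical non-negative weight linking concept c to target
    """
    lo = max(0, shot - window)
    hi = min(len(scores), shot + window + 1)
    weighted_sum = 0.0
    weight_total = 0.0
    for t in range(lo, hi):
        for concept, weight in correlations.items():
            if concept == target:
                continue  # use only the related concepts as evidence
            weighted_sum += weight * scores[t][concept]
            weight_total += weight
    if weight_total == 0:
        # No related evidence available: fall back to the detector score.
        return scores[shot][target]
    return weighted_sum / weight_total


def fuse(detected, inferred, lam=1.0):
    """Fusing phase (assumed form).

    Minimize the quadratic potential
        E(p) = (p - detected)**2 + lam * (p - inferred)**2
    Setting dE/dp = 0 gives p = (detected + lam * inferred) / (1 + lam).
    """
    return (detected + lam * inferred) / (1.0 + lam)


if __name__ == "__main__":
    # Toy detector scores for three consecutive shots (made-up numbers).
    shots = [
        {"car": 0.2, "road": 0.8},
        {"car": 0.4, "road": 0.6},
        {"car": 0.3, "road": 1.0},
    ]
    inferred = infer_probability(shots, {"road": 1.0}, shot=1, target="car")
    print(fuse(shots[1]["car"], inferred, lam=1.0))
```

With these toy numbers the inferred probability for "car" in the middle shot is about 0.8, and fusing it with the detector score of 0.4 at lam=1.0 gives roughly 0.6. Setting lam to 0 returns the raw detector score, so lam controls how much the inferred evidence is trusted.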