Multimodal Fusion for Video Search Reranking
Shikui Wei, Yao Zhao, Zhenfeng Zhu, Nan Liu
DOI: https://doi.org/10.1109/tkde.2009.145
IF: 9.235
2010-08-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Analysis of click-through data from a very large search engine log shows that users are usually interested in the top-ranked portion of the returned search results. It is therefore crucial for search engines to achieve high accuracy on the top-ranked documents. Although many methods exist for boosting video search performance, they either pay little attention to this factor or encounter difficulties in practical applications. In this paper, we present a flexible and effective reranking method, called CR-Reranking, to improve retrieval effectiveness. To offer high accuracy on the top-ranked results, CR-Reranking employs a cross-reference (CR) strategy to fuse multimodal cues. Specifically, multimodal features are first used separately to rerank the initially returned results at the cluster level, and then the ranked clusters from all modalities are used cooperatively to infer the shots with high relevance. Experimental results show that search quality, especially on the top-ranked results, is improved significantly.
computer science, information systems, artificial intelligence, engineering, electrical & electronic
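
The two-stage idea in the abstract (rank clusters separately in each modality, then combine the ranked clusters across modalities to pick out highly relevant shots) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes cluster assignments per modality are already available, estimates a cluster's relevance as the mean initial search score of its members, and fuses modalities by summing cluster ranks. All names (`rank_clusters`, `cr_rerank`, the shot and cluster labels) are hypothetical.

```python
from collections import defaultdict


def rank_clusters(cluster_of, initial_score):
    """Rank the clusters of one modality; rank 0 is the most relevant.

    Assumption: cluster relevance is the mean initial score of its shots.
    """
    scores = defaultdict(list)
    for shot, cluster_id in cluster_of.items():
        scores[cluster_id].append(initial_score[shot])
    mean = {cid: sum(vals) / len(vals) for cid, vals in scores.items()}
    ordered = sorted(mean, key=lambda cid: mean[cid], reverse=True)
    return {cid: rank for rank, cid in enumerate(ordered)}


def cr_rerank(initial_score, clusters_by_modality):
    """Rerank shots by cross-referencing cluster ranks over all modalities.

    Assumption: a shot's fused level is the sum of the ranks of the
    clusters it belongs to, and ties are broken by the initial score.
    """
    fused = {shot: 0 for shot in initial_score}
    for cluster_of in clusters_by_modality.values():
        cluster_rank = rank_clusters(cluster_of, initial_score)
        for shot in initial_score:
            fused[shot] += cluster_rank[cluster_of[shot]]
    # Lower fused level first, then higher initial score.
    return sorted(initial_score, key=lambda s: (fused[s], -initial_score[s]))


if __name__ == "__main__":
    # Toy data: four shots, two modalities, two clusters per modality.
    initial_score = {"s1": 0.9, "s2": 0.8, "s3": 0.7, "s4": 0.6}
    clusters_by_modality = {
        "color": {"s1": "a", "s2": "b", "s3": "a", "s4": "b"},
        "texture": {"s1": "x", "s2": "y", "s3": "x", "s4": "y"},
    }
    print(cr_rerank(initial_score, clusters_by_modality))
```

In this toy input, shot `s3` shares the top-ranked cluster with `s1` in both modalities, so it moves ahead of `s2` even though its initial score is lower. Other fusion rules, such as requiring membership in the top cluster of every modality, would fit the same description equally well.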