Multimedia Content-Based Visual Retrieval

Wengang Zhou, Houqiang Li, Qi Tian
DOI: https://doi.org/10.1016/b978-0-12-420149-1.00012-0
2014-01-01
Abstract: With the explosive growth of multimedia visual data on the Web, content-based visual retrieval has attracted considerable attention in both academia and industry. Building on the pioneering work on the invariant local SIFT feature and the classic Bag-of-Visual-Words model, the computer vision and multimedia communities have made rapid advances in content-based visual retrieval over the last decade. What distinguishes multimedia content-based retrieval from other visual processing problems is its emphasis on scalability to million- or billion-scale databases and on real-time query response. Because of the well-known semantic gap problem, most content-based visual retrieval methods target specific object/scene image retrieval or partial-duplicate Web image retrieval, where great success has been achieved. This chapter investigates the general framework of multimedia content-based visual retrieval. It gives an overview of the general visual search pipeline and discusses the five key modules of the pipeline separately in detail, introducing a series of methods that address the key issues in each module. Throughout, we focus on discussing the key problems, defining the algorithms, and illustrating the main ideas, with the goal of scalable retrieval in large-scale image databases.