A multimedia information fusion framework for web image categorization

Wenting Lu, Lei Li, Jingxuan Li, Tao Li, Honggang Zhang, Jun Guo
DOI: https://doi.org/10.1007/s11042-012-1165-2
IF: 2.577
2012-01-01
Multimedia Tools and Applications
Abstract: With the rapid development of fast Internet access and the popularization of digital cameras, an enormous number of digital images are posted and shared online every day. Web images are usually organized by topic and are often assigned topic-related textual descriptions. Given a large set of images along with the corresponding texts, a challenging problem is how to use the available information to perform image retrieval tasks, such as image classification and image clustering, efficiently and effectively. Previous approaches to image categorization either use text features or image features alone, or simply combine the two types of information. In this paper, we improve our two previously reported multi-view classification approaches (Dynamic Weighting and Region-based Semantic Concept Integration), which categorize images under the “supervision” of topic-related textual descriptions, by proposing a novel multimedia information fusion framework in which the two methods are seamlessly integrated by analyzing the special characteristics of different images. Note that the proposed framework is generic: it is not limited to our two previously reported approaches and can also be used to integrate other existing multi-view classification methods or models. The framework is also capable of handling large-scale image categorization. Specifically, it can automatically choose an appropriate classification model for each test image according to its special characteristics and consequently achieve better classification performance with relatively less computation time on large-scale datasets. Moreover, it is able to categorize images without any textual description in real-world applications.
Empirical experiments on two different types of web image datasets demonstrate the efficacy and efficiency of our proposed classification framework.