Web Image Annotation Based On Automatically Obtained Noisy Training Set

Mei Wang, Xiangdong Zhou, Hongtao Xu
DOI: https://doi.org/10.1007/978-3-540-78849-2_64
2008-01-01
Abstract: Training data acquisition is a problem in large-scale web image annotation based on statistical learning. A common idea is to build a large training set by analyzing web content automatically. However, noisy data is unavoidably involved in this kind of approach. In this paper, we present a novel web image annotation method based on a noisy training set using Mixture Component based Local Fisher Discriminant Analysis (MLFDA). In our method, image annotation is viewed as a multi-class classification problem. To alleviate the influence of the noisy data, the separating hyperplanes between different classes are learned by kernel-based local Fisher discriminant analysis. Then the mixture components for each class are estimated in the subspace, where the noisy modes will gain small weights and play a less important role in classification. The experimental results on a real-world web data set of 4000 images show that our method outperforms MBRM [3] and an SVM-based method, improving the F1 measure by 83% and 18%, respectively.
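The following toy sketch illustrates only the last idea in the abstract: when a mixture is fitted per class, a small cluster of mislabeled samples ends up in a low-weight component. It is not the authors' MLFDA implementation. It assumes the features have already been projected to a one-dimensional subspace (the kernel local Fisher discriminant step is not implemented here), uses plain EM with two Gaussian components, and all names and data values (`fit_mixture`, `classify`, the `sky` and `grass` samples) are hypothetical.

```python
import math


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_mixture(xs, n_iter=20, var_floor=1e-3):
    """Fit a two-component 1-D Gaussian mixture with plain EM (toy version)."""
    means = [min(xs), max(xs)]      # deterministic initialisation
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample
        resp = []
        for x in xs:
            p = [w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1.0   # guard against a zero total density
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean, and variance of each component
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, var_floor)  # avoid collapsing variances
    return weights, means, variances


def mixture_density(x, model):
    """Weighted sum of component densities at x."""
    weights, means, variances = model
    return sum(w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def classify(x, models):
    """Pick the class whose mixture assigns the highest density to x."""
    return max(models, key=lambda label: mixture_density(x, models[label]))


# Hypothetical projected features: eight clean samples and two noisy ones per class.
training = {
    "sky": [-0.2, -0.1, 0.0, 0.1, 0.2, -0.15, 0.05, 0.15, 4.9, 5.1],
    "grass": [2.8, 2.9, 3.0, 3.1, 3.2, 2.95, 3.05, 3.15, 7.9, 8.1],
}
models = {label: fit_mixture(xs) for label, xs in training.items()}

for label, (weights, means, _) in models.items():
    print(label, [round(w, 2) for w in weights], [round(m, 2) for m in means])
print(classify(0.1, models), classify(3.0, models))
```

In this sketch the two noisy samples of each class form their own component, whose weight stays well below that of the clean component, so they contribute little to the class density near the clean data.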