Hybrid learning framework for web information retrieval

Guang Feng, Kinman Lam, Xudong Zhang, Desheng Wang
DOI: https://doi.org/10.1109/ICNNSP.2008.4590415
2008-01-01
Abstract: Machine learning techniques are considered a very promising approach to Web information retrieval, which ranks samples by their relevance to an input query. However, the meaning of labeling in ranking is quite different from that in classification. Specifically, the labeling of samples for ranking is usually incomplete, i.e., only some of the samples are labeled. To close this methodological gap, we propose a hybrid learning framework, called fuzzy-label learning, which consists of two layers. First, a label-propagation algorithm estimates the labels of the unlabeled samples from their neighborhoods. Second, RankBoost is applied to the samples with the resulting fuzzy labels. Experiments with five-fold cross-validation on the Letor benchmark datasets show that the proposed hybrid learning framework can definitely improve the search performance achieved by the RankBoost algorithm for Web information retrieval.
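The two layers described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the propagation rule or how fuzzy labels enter RankBoost, so the k-nearest-neighbor averaging, the Euclidean distance, the Gaussian weights, the value of k, and the function names propagate_labels and preference_pairs are all assumptions made for illustration. The second function only builds a weighted distribution over preference pairs, which is the kind of input a RankBoost-style learner consumes; it does not run boosting.

```python
import math


def propagate_labels(features, labels, k=2, iterations=20):
    """Estimate fuzzy labels for unlabeled samples from their k nearest neighbors.

    features: list of feature vectors (lists of floats)
    labels:   list of relevance labels; None marks an unlabeled sample
    Assumes at least one sample is labeled (hypothetical interface).
    """
    n = len(features)
    known = [y for y in labels if y is not None]
    mean = sum(known) / len(known)
    # Unlabeled samples start at the mean of the known labels.
    fuzzy = [mean if y is None else float(y) for y in labels]

    # Precompute the k nearest neighbors of each sample (Euclidean distance).
    neighbors = []
    for i in range(n):
        dists = sorted(
            (math.dist(features[i], features[j]), j) for j in range(n) if j != i
        )
        neighbors.append(dists[:k])

    for _ in range(iterations):
        updated = list(fuzzy)
        for i in range(n):
            if labels[i] is not None:
                continue  # labeled samples stay clamped to their given label
            # Assumed Gaussian weighting: closer neighbors count more.
            weights = [math.exp(-d * d) for d, _ in neighbors[i]]
            total = sum(weights)
            if total > 0:
                updated[i] = sum(
                    w * fuzzy[j] for w, (_, j) in zip(weights, neighbors[i])
                ) / total
        fuzzy = updated
    return fuzzy


def preference_pairs(fuzzy):
    """Build a normalized distribution over pairs (i, j) where i should outrank j.

    Weighting each pair by the fuzzy-label difference is an assumption,
    not a detail taken from the paper.
    """
    pairs = [
        (i, j, fuzzy[i] - fuzzy[j])
        for i in range(len(fuzzy))
        for j in range(len(fuzzy))
        if fuzzy[i] > fuzzy[j]
    ]
    total = sum(w for _, _, w in pairs)
    return [(i, j, w / total) for i, j, w in pairs]


# Toy data: four documents on a line, only the two endpoints are labeled.
features = [[0.0], [0.1], [0.9], [1.0]]
labels = [0, None, None, 1]

fuzzy = propagate_labels(features, labels)
pairs = preference_pairs(fuzzy)
print(fuzzy)
print(pairs)
```

In this toy setting the unlabeled sample near the irrelevant document receives a fuzzy label below 0.5 and the one near the relevant document receives a label above 0.5, so every pair of samples gets a usable preference even though only two labels were given.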