A Similarity-Based Model for Topic Distillation

Xiaoyu Wang, Hongwei Wu, Li Wei, Aoying Zhou
DOI: https://doi.org/10.1142/S1469026802000592
2002-01-01
International Journal of Computational Intelligence and Applications
Abstract: Topic distillation is the process of finding representative pages relevant to a given query. Well-known approaches such as the HITS algorithm have been shown to be useful for this task. Many subsequent studies focus on augmenting HITS with content analysis to alleviate the steady deterioration of distillation quality that HITS suffers. In this paper, we revisit the behavior of HITS from a different point of view: a similarity-based analysis model is applied to observe the distillation procedure. By defining a generalized similarity, we propose an algorithm that improves distillation quality using only hyperlink information. Experimental results show that the new algorithm improves distillation quality without using any content information from the pages.
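For readers unfamiliar with the baseline, the following is a minimal sketch of the standard HITS iteration that the abstract refers to, computed from hyperlinks alone. It is not the similarity-based algorithm proposed in the paper, which the abstract does not specify. The function names `hits` and `normalize`, the edge-list input format, and the fixed iteration count are assumptions made for illustration.

```python
import math


def normalize(scores):
    """Scale a score dict to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {k: v / norm for k, v in scores.items()}


def hits(edges, iterations=50):
    """Baseline HITS on a directed graph given as (source, target) link pairs.

    Returns (authority, hub) dicts. Illustrative sketch only; the paper's
    similarity-based variant is not reproduced here.
    """
    nodes = sorted({n for edge in edges for n in edge})
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of pages that link to each page.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub update: sum the updated authority scores of pages each page links to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return auth, hub


if __name__ == "__main__":
    # Hypothetical toy link graph: pages "a" and "b" both link to "c"; "a" also links to "d".
    authority, hub_scores = hits([("a", "c"), ("b", "c"), ("a", "d")])
    print(sorted(authority.items(), key=lambda kv: -kv[1]))
```

In this toy graph the page with the most in-links from good hubs ("c") receives the highest authority score, which is the link-only signal the abstract says the proposed algorithm also relies on.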