LINK RELATION-BASED WEB PAGES SIMILARITY SEARCH

Dailu Jin, Yueqin Zhang, Mingxi Zhang
DOI: https://doi.org/10.3969/j.issn.1000-386x.2014.01.016
2014-01-01
Abstract: Web page similarity search plays an important role in many research fields, such as Web news recommendation and approximate query. SimRank is a classical similarity computation model; however, it does not scale to large Web page networks because its space and time costs are very high. Exploiting the fast convergence of SimRank, we propose an efficient Web page similarity search method (WSR). It pre-computes the 1-hop iterative similarity matrix, and then computes online the 2-hop iterative similarities between a given query page and the other pages from that pre-computed matrix. The efficiency of both pre-computation and online query processing is further improved by static pruning of the Web network. Experimental results show that WSR markedly reduces the storage cost and pre-computation time, while achieving higher accuracy and fast query response times.
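The abstract does not give the update formulas, so the following is only a minimal Python sketch of the two-stage idea under stated assumptions: "1-hop" and "2-hop" are read as the first and second iterations of the standard SimRank recurrence starting from the identity matrix, the decay factor c = 0.8 is an arbitrary illustrative value, and the function names (in_neighbors, precompute_one_hop, query_two_hop) are hypothetical. Static pruning is omitted because the abstract does not describe it.

```python
from collections import defaultdict


def in_neighbors(edges):
    """Map each page to the set of pages that link to it."""
    incoming = defaultdict(set)
    for src, dst in edges:
        incoming[dst].add(src)
    return incoming


def precompute_one_hop(nodes, incoming, c=0.8):
    """Offline step (assumed): first SimRank iteration from the identity matrix.

    With S0 = I, S1(a, b) = c * |I(a) & I(b)| / (|I(a)| * |I(b)|) for a != b.
    Only nonzero off-diagonal entries are stored, keeping the matrix sparse.
    """
    s1 = {}
    for a in nodes:
        for b in nodes:
            if a == b:
                continue
            ia, ib = incoming.get(a, set()), incoming.get(b, set())
            common = len(ia & ib)
            if common:
                s1[(a, b)] = c * common / (len(ia) * len(ib))
    return s1


def query_two_hop(query, nodes, incoming, s1, c=0.8):
    """Online step (assumed): second SimRank iteration for one query page only."""
    iq = incoming.get(query, set())
    scores = {}
    for b in nodes:
        ib = incoming.get(b, set())
        if b == query or not iq or not ib:
            continue
        # Diagonal entries of S1 are 1 by definition; missing entries are 0.
        total = sum(1.0 if i == j else s1.get((i, j), 0.0)
                    for i in iq for j in ib)
        if total:
            scores[b] = c * total / (len(iq) * len(ib))
    return scores


# Toy graph: x links to u and v; u links to a; v links to b.
edges = [("x", "u"), ("x", "v"), ("u", "a"), ("v", "b")]
nodes = ["x", "u", "v", "a", "b"]
incoming = in_neighbors(edges)
s1 = precompute_one_hop(nodes, incoming)
scores = query_two_hop("a", nodes, incoming, s1)
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

In this toy graph, pages a and b share no in-neighbor, so the pre-computed matrix has no entry for them; the online step still ranks b as similar to a because their in-neighbors u and v are both linked from x. Computing the second iteration only for the query row is what keeps the stored matrix small in this sketch.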