Combating Link Spam by Noisy Link Analysis

Yitong Wang, Xiaofei Chen, Xiaojun Feng
DOI: https://doi.org/10.1007/978-3-642-17316-5_43
2010-01-01
Abstract: Link spam has been identified as one of the major obstacles for link-based ranking algorithms of modern search engines, since it deliberately constructs hyperlink structures to help pages with poor content obtain undeservedly high ranks. This problem is even worse with the advent of wikis, blogs, and forums, which are rich in links. Existing work on link spam has mainly focused on detection by extracting special link structures (e.g., cliques and tight bipartite structures). However, link spam structures can have many variations, which easily render the existing detection methods ineffective. In this paper, we tackle the problem of link spam from a more fundamental viewpoint: "noisy link" analysis. First, we investigate how "non-voting" hyperlinks affect the quality of ranking; based on this investigation, we then propose an approach to detect and process "noisy links" effectively and automatically. We also compare our work with two related approaches (TrustRank and site-level noise removal) on two real web datasets. The experimental results demonstrate that the proposed "noisy link" analysis is very effective for both spam page filtering and final ranking improvement.