Harmonic Functions Based Semi-Supervised Learning for Web Spam Detection.

Weifeng Zhang, Danmei Zhu, Yingzhou Zhang, Guoqiang Zhou, Baowen Xu
DOI: https://doi.org/10.1145/1982185.1982204
2011-01-01
Abstract: We propose a new semi-supervised learning algorithm for web spam detection, named HFSSL (harmonic functions based semi-supervised learning). In our method, labeled and unlabeled web pages are represented as vertices in a weighted graph. The learning problem is then modeled as a Gaussian random field on this graph, where the mean of the field is characterized by harmonic functions, which can be efficiently obtained using matrix methods. Experiments on the standard WEBSPAM-UK2006 dataset show that our algorithm is effective.
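The harmonic-function idea in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact formulation, so the code below is only an assumption-laden illustration: it uses a Gaussian similarity weight, toy one-dimensional features, and Gauss-Seidel-style iterative averaging in place of the closed-form matrix solve the abstract mentions. The names `gaussian_weights`, `harmonic_labels`, and `sigma` are hypothetical and do not come from the paper. The defining property is that each unlabeled page's score equals the weighted average of its neighbors' scores, while labeled pages stay fixed.

```python
import math


def gaussian_weights(features, sigma=1.0):
    """Dense symmetric weights w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).

    The Gaussian kernel and the sigma parameter are assumptions for this sketch.
    """
    n = len(features)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            w[i][j] = w[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return w


def harmonic_labels(weights, labels, iters=1000, tol=1e-9):
    """Approximate the harmonic function on the graph by in-place averaging.

    labels maps a vertex index to 0.0 (normal) or 1.0 (spam); those values
    are held fixed. Unlabeled vertices start at 0.5 and are repeatedly
    replaced by the weighted mean of their neighbors until updates are tiny.
    """
    n = len(weights)
    f = [labels.get(i, 0.5) for i in range(n)]
    for _ in range(iters):
        delta = 0.0
        for i in range(n):
            if i in labels:
                continue  # labeled vertices act as boundary conditions
            total = sum(weights[i])
            if total == 0.0:
                continue  # isolated vertex: no information to propagate
            new = sum(weights[i][j] * f[j] for j in range(n)) / total
            delta = max(delta, abs(new - f[i]))
            f[i] = new
        if delta < tol:
            break
    return f


# Toy data (made up for illustration): two clusters, one labeled page in each.
toy_features = [[0.0], [0.1], [0.2], [1.0], [0.9], [0.8]]
toy_labels = {0: 0.0, 3: 1.0}  # page 0 is normal, page 3 is spam
toy_weights = gaussian_weights(toy_features, sigma=0.3)
toy_scores = harmonic_labels(toy_weights, toy_labels)
print([round(s, 3) for s in toy_scores])
```

Thresholding the resulting scores at 0.5 would give a spam/normal decision for each unlabeled page. For large graphs, a sparse linear solve over the unlabeled block of the graph Laplacian is the usual alternative to iterating.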