Semantics-based Hyperlink Classification: Design and Implementation

奚伟鹏,李昕,武港山
DOI: https://doi.org/10.3969/j.issn.1001-3695.2004.11.056
2004-01-01
Abstract: Research on hyperlinks plays an important role in Web mining. This paper proposes a framework for automatic semantics-based hyperlink classification and describes its design and implementation in detail. Features that reflect a hyperlink's semantic content are extracted and quantified automatically. A decision-tree approach using the C4.5 algorithm is then applied to classify the hyperlinks. The selected features and the generated decision rules are based on machine learning over a large number of hand-tagged training samples. By typing hyperlinks from a semantic point of view, this research promises to be of great help for more effective automatic processing of Web resources.
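
The abstract names C4.5 but does not list the features or link types used. As a rough illustration only, the sketch below computes the gain ratio, the criterion C4.5 uses to choose a split, over a few hand-tagged samples. The feature names (`same_site`, `in_menu`), the labels (`navigation`, `reference`), and the sample data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(samples, labels, feature):
    """Gain ratio of a categorical feature: information gain / split information."""
    total = len(labels)
    # Group the labels by the value the feature takes in each sample.
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(sample[feature], []).append(label)
    # Weighted entropy of the labels that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes features with many distinct values.
    split_info = entropy([sample[feature] for sample in samples])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical hand-tagged hyperlinks (illustrative only, not from the paper).
samples = [
    {"same_site": True, "in_menu": True},
    {"same_site": True, "in_menu": True},
    {"same_site": True, "in_menu": False},
    {"same_site": False, "in_menu": False},
]
labels = ["navigation", "navigation", "reference", "reference"]

# The feature with the highest gain ratio would become the root test of the tree.
best_feature = max(["same_site", "in_menu"],
                   key=lambda f: gain_ratio(samples, labels, f))
print(best_feature)
```

A full C4.5 implementation would apply this selection recursively to each subset, handle continuous features with thresholds, and prune the resulting tree; none of that is shown here.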