Similarity Evaluation of Tree-Structured Web Pages

祁钰,关毅,吕新波,岳淑珍
DOI: https://doi.org/10.3969/j.issn.1001-7011.2009.05.012
2009-01-01
Abstract: A similarity calculation method for tree-structured web pages is proposed. The tag structure of each web page is first transformed into a tree. A dynamic programming algorithm then matches the most similar child nodes at each layer of the two trees, and the comparison continues recursively on the matched pairs. Nodes that cannot be matched contribute to the distance, and the total distance between the two trees is the sum of these contributions, from which the similarity degree is calculated. Experimental results show that the method can effectively and precisely distinguish different web pages.
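The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that an unmatched node costs its whole subtree size, that a matched pair with different tags costs 1, and that similarity is one minus the distance divided by the combined tree size. The names Node, TagTreeBuilder, parse_tag_tree, tree_distance and similarity are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

# HTML elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)

    def size(self) -> int:
        """Number of nodes in the subtree rooted here."""
        return 1 + sum(child.size() for child in self.children)


class TagTreeBuilder(HTMLParser):
    """Builds a tree from the tag structure only (text and attributes are ignored)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")  # synthetic root, an assumption of this sketch
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag, if there is one.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break


def parse_tag_tree(html: str) -> Node:
    builder = TagTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def tree_distance(a: Node, b: Node) -> int:
    """Distance between two trees, aligning children layer by layer with DP."""
    cost = 0 if a.tag == b.tag else 1  # assumed cost for a tag mismatch
    xs, ys = a.children, b.children
    # dp[i][j]: minimum distance between the first i children of a
    # and the first j children of b.
    dp = [[0] * (len(ys) + 1) for _ in range(len(xs) + 1)]
    for i in range(1, len(xs) + 1):
        dp[i][0] = dp[i - 1][0] + xs[i - 1].size()
    for j in range(1, len(ys) + 1):
        dp[0][j] = dp[0][j - 1] + ys[j - 1].size()
    for i in range(1, len(xs) + 1):
        for j in range(1, len(ys) + 1):
            dp[i][j] = min(
                dp[i - 1][j] + xs[i - 1].size(),   # child of a left unmatched
                dp[i][j - 1] + ys[j - 1].size(),   # child of b left unmatched
                dp[i - 1][j - 1] + tree_distance(xs[i - 1], ys[j - 1]),  # matched pair
            )
    return cost + dp[-1][-1]


def similarity(a: Node, b: Node) -> float:
    """Assumed normalization: 1.0 for identical trees, lower for larger distances."""
    return 1.0 - tree_distance(a, b) / (a.size() + b.size())
```

For example, comparing `<html><body><div><p>a</p><p>b</p></div></body></html>` with `<html><body><div><p>a</p></div><table></table></body></html>` under these assumptions gives a distance of 2 (one unmatched `p` and one unmatched `table`). The recursion recomputes subtree distances and sizes, which is acceptable for a sketch but would need memoization for large pages.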