STUDY AND IMPROVEMENT ON LINKAGE SIMILARITY-BASED WEB MINING ALGORITHM

Yang Yifan, Zhu Ming, Li Huahu
DOI: https://doi.org/10.3969/j.issn.1000-386X.2011.01.082
2011-01-01
Abstract: Starting from the classification of Web mining approaches, this paper studies and analyzes HITS, a Web structure mining algorithm based on link analysis. HITS considers only the in-links and out-links of the pages in the root set and, when building the expanded set, ignores how similar the linking and linked pages are. To address this shortcoming, an improved algorithm, DS-HITS, is proposed. It introduces several kinds of weights that reflect page similarity into the processing of the expanded set, so that the hub ("core") values and authority values of the retrieved pages improve significantly. Finally, the search results of DS-HITS and HITS are compared using the initial data of Webla's open source project.
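The idea of weighting links by page similarity can be illustrated with a minimal sketch of the standard HITS iteration, in which each link optionally carries a similarity weight. This is not the DS-HITS algorithm itself: the abstract does not specify its weights, so the function name `weighted_hits`, the `weight` dictionary, and the choice of values in [0, 1] are assumptions made only for illustration. With no weights supplied, the sketch reduces to plain HITS.

```python
import math


def _normalize(scores):
    """Scale scores to unit Euclidean length; leave all-zero scores unchanged."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {k: v / norm for k, v in scores.items()}


def weighted_hits(edges, weight=None, iterations=50):
    """Run HITS on a list of (source, target) links.

    `weight` is a hypothetical dict mapping (source, target) to a
    similarity score; missing links default to 1.0 (plain HITS).
    Returns (hub, authority) dictionaries keyed by page.
    """
    edges = list(edges)
    weight = weight or {}
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority update: sum of (weighted) hub scores of pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for s, t in edges:
            new_auth[t] += weight.get((s, t), 1.0) * hub[s]
        # Hub update: sum of (weighted) authority scores of pages linked to.
        new_hub = {n: 0.0 for n in nodes}
        for s, t in edges:
            new_hub[s] += weight.get((s, t), 1.0) * new_auth[t]
        auth = _normalize(new_auth)
        hub = _normalize(new_hub)

    return hub, auth


if __name__ == "__main__":
    links = [("a", "c"), ("b", "c"), ("a", "d")]
    # Assumed similarity: the link a -> d joins two dissimilar pages.
    hub, auth = weighted_hits(links, weight={("a", "d"): 0.2})
    for page in sorted(auth):
        print(page, round(hub[page], 3), round(auth[page], 3))
```

In this sketch a low similarity weight reduces how much a link contributes to both the target's authority value and the source's hub value, which is one plausible way to stop loosely related pages in the expanded set from inflating scores.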