Analysis and Implementation of Extraction Algorithm of Web Hierarchy Structure

FENG Yan, WANG Shen-kang
DOI: https://doi.org/10.3785/j.issn.1008-973x.2005.10.010
2005-01-01
Abstract: A method for rebuilding the hierarchy structure of a Web site was proposed in order to improve the efficiency of search engines, Web management, recommender systems, etc. By analyzing structural information such as directory information and link information, the Web tag structure graph was defined and built on the basis of artificial intelligence and graph theory. Dijkstra's algorithm was applied to extract the hierarchy of the Web site. The algorithm structure is composed of five layers: display layer, Web layer, page analysis layer, pretreatment layer and link layer. Experimental results show that the algorithm can be implemented efficiently and runs quickly, and that the extracted Web hierarchy structure is correct.
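To illustrate the idea of extracting a site hierarchy with Dijkstra's algorithm, the following is a minimal sketch, not the authors' implementation. It assumes the site is given as a weighted link graph and treats the shortest-path tree rooted at the home page as the hierarchy. The function name `extract_hierarchy`, the sample URLs, and the edge weights (lower cost for links that descend into a subdirectory) are hypothetical; the abstract does not state how the Web tag structure graph is weighted.

```python
import heapq


def extract_hierarchy(graph, root):
    """Return {page: parent} for the shortest-path tree rooted at `root`.

    `graph` maps each page to a dict {linked_page: cost}. The costs are an
    assumption made for this sketch; the abstract does not specify them.
    """
    dist = {root: 0.0}
    parent = {root: None}
    heap = [(0.0, root)]
    done = set()
    while heap:
        d, page = heapq.heappop(heap)
        if page in done:
            continue  # stale heap entry, a shorter path was already settled
        done.add(page)
        for target, weight in graph.get(page, {}).items():
            new_dist = d + weight
            if new_dist < dist.get(target, float("inf")):
                dist[target] = new_dist
                parent[target] = page  # best known parent in the hierarchy
                heapq.heappush(heap, (new_dist, target))
    return parent


# Hypothetical site: links into a subdirectory cost 1, other links cost more.
site = {
    "/": {"/docs/": 1.0, "/news/": 1.0, "/docs/api.html": 3.0},
    "/docs/": {"/docs/api.html": 1.0, "/": 2.0},
    "/news/": {"/docs/api.html": 2.0, "/": 2.0},
    "/docs/api.html": {"/": 2.0},
}

hierarchy = extract_hierarchy(site, "/")
```

With these assumed weights, `/docs/api.html` is attached under `/docs/` rather than directly under the home page, because the path through its own directory is cheaper than the direct link. Back links to the home page never change the tree, since the root already has distance zero.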