The construct and use of semantic tree of Web pages

Yanbin Zhao, Qinghua Li, Feng Zhao
DOI: https://doi.org/10.3321/j.issn:1671-4512.2005.z1.064
2005-01-01
Abstract: Analysis of ill-formed Web pages shows that their document object model (DOM) trees contain extra hierarchy nodes and inaccurate child nodes. A fault-tolerant technique for constructing a semantic tree of Web pages is proposed. The resulting semantic tree can support research on text classification and clustering, network community discovery, Web topical information extraction, and Web information retrieval.
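The abstract does not describe the construction algorithm itself, so the following is only a minimal sketch of what fault-tolerant tree building over ill-formed HTML might look like, not the authors' method. It uses Python's standard html.parser module; the names Node, TolerantTreeBuilder, build_tree, and outline are hypothetical, and the two recovery rules (implicitly close unclosed descendants, ignore stray end tags) are assumptions chosen for illustration.

```python
from html.parser import HTMLParser

# Elements that never have children (a small illustrative subset).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    """One node of the illustrative tree (hypothetical structure)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = []


class TolerantTreeBuilder(HTMLParser):
    """Builds a tree while tolerating unclosed and stray tags."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # descend only into non-void elements

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag. Anything
        # left open below it is closed implicitly. If no such element
        # exists, the end tag is stray and is ignored.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.current.text.append(stripped)


def build_tree(html):
    """Parse possibly ill-formed HTML and return the root Node."""
    builder = TolerantTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node):
    """Return a nested (tag, children) tuple for easy inspection."""
    return (node.tag, [outline(child) for child in node.children])


if __name__ == "__main__":
    # <b> is never closed, </span> is stray, the second <p> is unclosed.
    broken = "<div><p>Hello<b>world</p></span><p>Next</div>"
    print(outline(build_tree(broken)))
```

In this sketch the unclosed b element is closed when its enclosing p ends, and the stray span end tag leaves the tree unchanged, so both p elements remain siblings under div. A semantic tree as described in the abstract would presumably require further steps beyond this structural repair.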