XML Documents Structured Cluster

HAO Xiao-li, FENG Zhi-yong
2005-01-01
Journal of Computer Applications
Abstract: This article proposes a novel method for clustering XML documents that addresses the shortcomings of existing methods. Based on the concept of segment matching, the method calculates the similarity of two XML trees, which is used to measure the similarity between the two XML trees as a whole. Throughout the clustering process, each cluster is equipped with an XML cluster representative, which subsumes the most typical structural features of a set of XML documents. The cluster representative is constructed in three successive steps: tree matching, tree merging, and tree pruning. Clustering is then accomplished by comparing cluster representatives and updating the representatives as soon as new clusters are detected. Finally, the effectiveness of the clustering method is evaluated through test results.
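The general shape of representative-based clustering described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes each document is reduced to its set of root-to-node tag paths, it uses Jaccard overlap as a stand-in for the segment-matching similarity, and it approximates tree merging and tree pruning with path counts and a support cutoff. The names `paths`, `similarity`, `Cluster`, `cluster`, `threshold`, and `min_support` are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def paths(xml_text):
    """Return the set of root-to-node tag paths of an XML document."""
    root = ET.fromstring(xml_text)
    found = set()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        found.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return found


def similarity(a, b):
    """Jaccard overlap of two path sets (assumed stand-in for segment matching)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class Cluster:
    """A cluster that keeps per-path counts to derive its representative."""

    def __init__(self, doc_paths):
        self.size = 1
        self.counts = Counter(doc_paths)

    def merge(self, doc_paths):
        # "Merging" (assumed form): fold the new document's paths into the counts.
        self.size += 1
        self.counts.update(doc_paths)

    def representative(self, min_support=0.5):
        # "Pruning" (assumed form): keep paths present in at least
        # min_support of the cluster's documents.
        return {p for p, c in self.counts.items() if c / self.size >= min_support}


def cluster(docs, threshold=0.5):
    """Assign each document to the most similar representative, else open a new cluster."""
    clusters, labels = [], []
    for text in docs:
        p = paths(text)
        best, best_sim = None, 0.0
        for i, c in enumerate(clusters):
            s = similarity(p, c.representative())
            if s > best_sim:
                best, best_sim = i, s
        if best is not None and best_sim >= threshold:
            clusters[best].merge(p)  # representative is updated implicitly
            labels.append(best)
        else:
            clusters.append(Cluster(p))
            labels.append(len(clusters) - 1)
    return labels, clusters
```

In this sketch, two `book` documents that differ by a single `year` element fall into the same cluster, while a structurally unrelated `order` document opens a new one. The `threshold` and `min_support` values are arbitrary illustration choices, not parameters reported in the article.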