Similarity Evaluation of XML Documents Based on Weighted Element Tree Model

Chenying Wang, Xiaojie Yuan, Hua Ning, Xin Lian
DOI: https://doi.org/10.1007/978-3-642-03348-3_71
2009-01-01
Abstract: The logical presentation model of XML data is the basis of XML data management. After introducing XML tree models and frequent pattern models, this paper proposes a novel Weighted Element Tree Model (WETM) for measuring the structural similarity of XML documents. The model is a concise form of the XML tree model, so operations on it are more efficient than operations on full XML tree models. Compared with frequent pattern models, the WETM better expresses the structural information of subtrees, which can improve the accuracy of similarity evaluation. To evaluate the performance of the proposed algorithm, we apply it to XML document clustering. The experimental results show that our algorithm is superior to algorithms based on tree models or frequent pattern models.