XML Structural Clustering Based on the Improved Particle Swarm Optimization

Tong WANG, Da-xin LIU
DOI: https://doi.org/10.3969/j.issn.1000-1220.2007.05.023
2007-01-01
Abstract: XML data mining often fails to take some structural and semantic information into account. To handle large sets of XML documents, this paper first proposes a novel structural model, H-path, built using ontology and supports, and then maps each XML document to this model. On that basis, a clustering method using an improved particle swarm optimization (PSO), called CIP, is given. In practice, the CIP algorithm is combined with the K-means method. The advantage is that the search can escape local optima of the search space and reach the global optimum at a satisfactory time cost. Experimental results show that the proposed technique outperforms the standard K-means method in clustering convergence and accuracy.
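The abstract does not give the details of CIP, so the following is only a minimal sketch of the general idea of mixing PSO with K-means, under several assumptions: documents are taken to be already mapped to numeric feature vectors (the H-path model itself is not reproduced), the PSO update is the textbook one rather than the paper's improved variant, and every function name and parameter value (`pso_kmeans`, `w`, `c1`, `c2`, and so on) is hypothetical. Each particle encodes a full set of k centroids; after every PSO move, one K-means step refines the particle.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def sse(points, centroids):
    """Clustering cost: sum of squared distances to the nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def kmeans_step(points, centroids):
    """One K-means iteration: assign points, then recompute cluster means."""
    labels = assign(points, centroids)
    updated = []
    for j, c in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            updated.append([sum(m[d] for m in members) / len(members)
                            for d in range(len(c))])
        else:
            updated.append(list(c))  # keep an empty cluster's centroid as is
    return updated


def pso_kmeans(points, k, n_particles=10, iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook PSO over centroid sets, refined by K-means (assumed hybrid)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Each particle is a list of k centroids drawn from the data.
    swarm = [[list(p) for p in rng.sample(points, k)]
             for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[list(c) for c in part] for part in swarm]
    pbest_cost = [sse(points, part) for part in swarm]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = [list(c) for c in pbest[g]], pbest_cost[g]

    for _ in range(iters):
        for i in range(n_particles):
            part = swarm[i]
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - part[j][d])
                                    + c2 * r2 * (gbest[j][d] - part[j][d]))
                    part[j][d] += vel[i][j][d]
            # Hybrid step: refine the moved particle with one K-means pass.
            part = kmeans_step(points, part)
            swarm[i] = part
            cost = sse(points, part)
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = [list(c) for c in part], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = [list(c) for c in part], cost
    return gbest, assign(points, gbest)
```

In this sketch the swarm supplies global exploration while the K-means pass supplies fast local refinement, which is one plausible reading of how such a hybrid could avoid the local optima that plain K-means is prone to. Calling `pso_kmeans(vectors, k=2)` returns the best centroid set found and a cluster label for each input vector.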