Cost-based heuristic algorithm for repairing inconsistent XML document

WU Ai-Hua, WANG Xian-Sheng, TAN Zi-Jing, Wei Wang
DOI: https://doi.org/10.3724/SP.J.1001.2009.03225
2009-01-01
Ruan Jian Xue Bao/Journal of Software
Abstract: Computing a repair for an inconsistent XML document is important in many applications. However, finding an optimal repair is an NP-complete problem, especially when the document violates both functional dependencies and key constraints. This paper proposes a cost-based heuristic algorithm that finds a repair with the lowest cost in polynomial time. It first scans the original XML document once to collect the inconsistent data. It then computes the general candidate repairs for each piece of inconsistent data and heuristically assembles a repair of the whole document based on cost. The experimental evaluation shows that even when XML documents are large, contain a high percentage of dirty elements, and are checked against many different constraints, the algorithm still runs in less than O(n^3) time with respect to the size of the set of inconsistent elements. © by Institute of Software, the Chinese Academy of Sciences.
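The three steps named in the abstract (one scan to collect inconsistent data, candidate repairs per inconsistency, cost-based selection) can be illustrated with a minimal sketch. This is not the authors' algorithm: it handles only a single functional dependency between two attributes, counts cost as the number of attribute values rewritten, and assumes every element carries both attributes. The function name `repair_fd`, the `book`/`isbn`/`title` vocabulary, and the cost model are all hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def repair_fd(root, tag, lhs, rhs):
    """Greedily repair violations of the dependency lhs -> rhs among <tag> elements.

    Returns the total cost, counted as the number of attribute values changed.
    Assumes every <tag> element has both the lhs and rhs attributes.
    """
    # Step 1: a single scan groups elements by their lhs value.
    groups = defaultdict(list)
    for elem in root.iter(tag):
        groups[elem.get(lhs)].append(elem)

    total_cost = 0
    for elems in groups.values():
        counts = Counter(e.get(rhs) for e in elems)
        if len(counts) <= 1:
            continue  # all rhs values agree, so this group is consistent

        # Step 2: each distinct rhs value is a candidate repair; its cost is
        # the number of elements that would have to be rewritten to match it.
        # Step 3: choose the cheapest candidate, i.e. the most frequent value.
        best_value, best_count = counts.most_common(1)[0]
        for e in elems:
            if e.get(rhs) != best_value:
                e.set(rhs, best_value)
        total_cost += len(elems) - best_count
    return total_cost


doc = ET.fromstring(
    "<library>"
    '<book isbn="1" title="XML Basics"/>'
    '<book isbn="1" title="XML Basics"/>'
    '<book isbn="1" title="XML Basic"/>'
    '<book isbn="2" title="Databases"/>'
    "</library>"
)
cost = repair_fd(doc, "book", "isbn", "title")
```

In this toy document the three books with isbn "1" disagree on the title, so the sketch rewrites the minority value and reports a cost of 1. Handling key constraints alongside functional dependencies, as the paper does, would need additional candidate repairs (for example, deleting or modifying duplicate-key elements) that this sketch does not attempt.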