KCAM: concentrating on structural similarity for XML fragments

Lingbo Kong, Shiwei Tang, Dongqing Yang, Tengjiao Wang, Jun Gao
DOI: https://doi.org/10.1007/11775300_4
2006-01-01
Abstract: This paper proposes a new method, KCAM, for measuring the structural similarity of XML fragments that satisfy given keywords. The name comes from the method's key structure, the Keyword Common Ancestor Matrix. The KCAM of an XML fragment matching k keywords is a k × k upper triangular matrix in which each element a_{i,j} stores the level of the SLCA (Smallest Lowest Common Ancestor) node for the keywords k_i and k_j. The matrix distance between two KCAMs, denoted KDist(), serves as an approximate measure of structural similarity. KCAM is independent of the label information in fragments and is effective at distinguishing structural differences between XML fragments.
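The matrix construction described in the abstract can be illustrated with a minimal sketch. Several details here are assumptions, since the abstract does not specify them: each keyword is taken to match exactly one node, nodes are identified by Dewey-style codes (tuples giving the path from the root), the root is at level 1, the diagonal entry holds the level of the keyword node itself, and the distance is computed as the Frobenius norm of the matrix difference. The function names `lca_level`, `kcam`, and `kdist` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def lca_level(code_a, code_b):
    """Level of the lowest common ancestor of two nodes given as
    Dewey-style codes (assumed encoding); the root is at level 1."""
    level = 0
    for x, y in zip(code_a, code_b):
        if x != y:
            break
        level += 1
    return level


def kcam(keyword_nodes):
    """Build a k x k upper triangular matrix whose entry (i, j), i <= j,
    is the level of the common ancestor of the nodes for keywords i and j.
    Entries below the diagonal are left at 0."""
    k = len(keyword_nodes)
    matrix = [[0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i, k):
            matrix[i][j] = lca_level(keyword_nodes[i], keyword_nodes[j])
    return matrix


def kdist(m1, m2):
    """Assumed distance: Frobenius norm of the element-wise difference.
    The paper's actual KDist() definition may differ."""
    return sqrt(sum((a - b) ** 2
                    for row1, row2 in zip(m1, m2)
                    for a, b in zip(row1, row2)))


# Fragment A: keywords 1 and 2 share a parent at level 2; keyword 3 sits
# directly under the root.
frag_a = [(0, 0, 0), (0, 0, 1), (0, 1)]
# Fragment B: all three keywords sit directly under the root.
frag_b = [(0, 0), (0, 1), (0, 2)]

dist = kdist(kcam(frag_a), kcam(frag_b))
```

Because only ancestor levels enter the matrices, two fragments with different element labels but the same shape would receive a distance of zero under this sketch, which matches the abstract's claim that KCAM is independent of label information.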