MAXLCA: a New Query Semantic Model for XML Keyword Search

Ning Gao, Zhi-Hong Deng, Jia-Jian Jiang, Hang Yu
2012-01-01
Journal of Web Engineering
Abstract: Keyword search lets web users access XML data without understanding complex data schemas. However, the ambiguity of keyword queries makes it hard to select the data nodes that properly match the keywords. To address this challenge for XML datasets whose documents have a relatively small average size, we present a new keyword query semantic model: MAXimal Lowest Common Ancestor (MAXLCA). MAXLCA effectively avoids the false negative problem observed in ELCA, SLCA and XSeek. Furthermore, we construct an algorithm, GMAX, for MAXLCA-based queries, which proves efficient in our evaluations. Experiments on INEX show that the search engine using MAXLCA and GMAX outperforms the compared approaches on all three criteria: effectiveness, efficiency and processing scalability.