Capturing Semantic Hierarchies to Perform Meaningful Integration in HTML Tables

Shijun Li, Mengchi Liu, Guoren Wang, Zhiyong Peng
DOI: https://doi.org/10.1007/978-3-540-24655-8_101
2004-01-01
Abstract: We present a new approach that automatically captures the semantic hierarchies in HTML tables and semi-automatically integrates HTML tables belonging to a domain. It first automatically captures the attribute-value pairs in HTML tables by normalizing the tables and recognizing their headings. After the global schema is generated manually, it learns the lexical semantic sets and contexts, which it then uses to eliminate conflicts and resolve nondeterministic problems when mapping each source schema to the global schema, thereby integrating the data in the HTML tables.
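The first step described in the abstract (normalizing a table, then reading attribute-value pairs off its headings) can be illustrated with a minimal sketch. This is not the authors' algorithm: the class `TableGrid`, the functions `normalize` and `attribute_value_pairs`, and the `header_rows` parameter are hypothetical names introduced here. The sketch assumes that normalization means expanding `rowspan`/`colspan` into a rectangular grid, that the first column holds row headings, and that the number of heading rows is supplied by hand rather than recognized automatically.

```python
from html.parser import HTMLParser


class TableGrid(HTMLParser):
    """Collect the cells of one HTML table, expanding rowspan/colspan."""

    def __init__(self):
        super().__init__()
        self.grid = {}    # (row, col) -> cell text
        self.row = -1
        self.col = 0
        self.span = None  # (rowspan, colspan) of the cell being read
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row += 1
            self.col = 0
        elif tag in ("td", "th"):
            a = dict(attrs)
            self.span = (int(a.get("rowspan", 1)), int(a.get("colspan", 1)))
            self.text = []

    def handle_data(self, data):
        if self.span is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.span is not None:
            # Skip slots already filled by a rowspan from an earlier row.
            while (self.row, self.col) in self.grid:
                self.col += 1
            value = " ".join("".join(self.text).split())
            rows, cols = self.span
            for r in range(rows):
                for c in range(cols):
                    self.grid[(self.row + r, self.col + c)] = value
            self.col += cols
            self.span = None


def normalize(html):
    """Return the table as a rectangular list of rows (assumed normalization)."""
    parser = TableGrid()
    parser.feed(html)
    parser.close()
    n_rows = 1 + max(r for r, _ in parser.grid)
    n_cols = 1 + max(c for _, c in parser.grid)
    return [[parser.grid.get((r, c), "") for c in range(n_cols)]
            for r in range(n_rows)]


def attribute_value_pairs(rows, header_rows=1):
    """Pair each data cell with its row heading and its stacked column headings."""
    pairs = []
    for row in rows[header_rows:]:
        for c in range(1, len(row)):
            path = []
            for heading in rows[:header_rows]:
                # Drop repeats created by expanding a spanning heading cell.
                if heading[c] and (not path or path[-1] != heading[c]):
                    path.append(heading[c])
            pairs.append((row[0], tuple(path), row[c]))
    return pairs


SAMPLE = """
<table>
<tr><th rowspan="2">Model</th><th colspan="2">Price</th></tr>
<tr><th>USD</th><th>EUR</th></tr>
<tr><td>A1</td><td>100</td><td>90</td></tr>
</table>
"""

for pair in attribute_value_pairs(normalize(SAMPLE), header_rows=2):
    print(pair)
```

In this sketch the nested heading "Price" over "USD" and "EUR" becomes a two-level attribute path, which is one simple way to represent the kind of semantic hierarchy the abstract refers to. Mapping such paths to a global schema, and the learned lexical semantic sets and contexts, are outside what this example covers.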