Extracting Local Schema from Semistructured Data Based on Graph-Oriented Semantic Model

Wang Tengjiao, Tang Shiwei, Yang Dongqing, Liu Yunfeng, Lin Bin
DOI: https://doi.org/10.1007/bf02943240
IF: 1.871
2001-01-01
Journal of Computer Science and Technology
Abstract: Many modern applications (e-commerce, digital libraries, etc.) require integrated access to various information sources, from traditional RDBMSs to semistructured Web repositories. Extracting schema from semistructured data is a prerequisite for integrating heterogeneous information sources. The traditional method, which extracts a global schema, may require time and space that grow exponentially with the number of objects and edges in the source. This paper presents a new method that extracts a local schema instead. The algorithm keeps the extracted schema within a “schema diameter” by examining the semantic distance of the target set and by using the Hash class and its path distance operation. This method efficiently restrains the schema from expanding. A prototype validates the new approach.
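The idea of bounding schema extraction by a diameter can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not define the semantic distance, the Hash class, or the path distance operation. The sketch assumes the source is an edge-labeled graph stored as an adjacency dictionary, approximates semantic distance by the number of edges from a target object, and uses a plain Python set for hash-based deduplication of label paths. The function name `extract_local_schema`, the `diameter` parameter, and the sample data are all hypothetical.

```python
from collections import deque


def extract_local_schema(graph, targets, diameter):
    """Collect the distinct label paths that start at any target object
    and are at most `diameter` edges long.

    Assumption: distance is counted in edges; the paper's semantic
    distance may be defined differently.
    """
    schema = set()                       # distinct label paths found so far
    seen = set()                         # (object, path) states already queued
    queue = deque((t, ()) for t in targets)
    while queue:
        node, path = queue.popleft()
        if len(path) == diameter:        # diameter reached: do not expand
            continue
        for label, child in graph.get(node, []):
            new_path = path + (label,)
            schema.add(new_path)
            state = (child, new_path)
            if state not in seen:        # avoid re-exploring the same state
                seen.add(state)
                queue.append(state)
    return schema


# Hypothetical semistructured source: object id -> list of (label, child id).
graph = {
    "root": [("book", "b1"), ("book", "b2")],
    "b1": [("title", "t1"), ("author", "a1")],
    "b2": [("title", "t2")],
    "a1": [("name", "n1")],
}

local_schema = extract_local_schema(graph, ["root"], 2)
```

Because each path is capped at `diameter` labels, the traversal terminates even on cyclic sources, and the result cannot grow beyond the paths within that bound, which is the kind of containment the abstract attributes to the local approach.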