Using Ordered Mutual Information to Match Schema with Opaque Column Names and Data Values

Lele GUO, Youfang LIN, Sheng HAN
DOI: https://doi.org/10.3778/j.issn.1673-9418.1609004
2017-01-01
Abstract: Schema matching is a key issue in data integration and the core task when merging heterogeneous data sources. Many schema matching methods have been proposed, but most lack generality because they depend heavily on descriptive information about the schema, which makes them difficult to apply to other scenarios. To solve this problem, this paper proposes a novel schema matching method that uses ordered mutual information and does not rely on any descriptive schema information, such as column names, column types, or foreign key constraints, which makes it broadly applicable. Extensive experiments on various datasets indicate that the proposed technique outperforms earlier schema matching methods in both efficiency and accuracy.
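As a rough illustration of why a matcher based on mutual information need not see column names or interpret data values, the sketch below computes the standard empirical mutual information between two columns and shows that the score is unchanged when one column's values are replaced by opaque codes. This is plain mutual information, not the ordered variant the paper proposes, whose definition the abstract does not give. The function name `mutual_information` and the toy columns are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two equal-length columns.

    Only value co-occurrence counts are used, so column names and the
    meaning of the values play no role in the result.
    """
    if len(xs) != len(ys):
        raise ValueError("columns must have the same number of rows")
    n = len(xs)
    if n == 0:
        return 0.0
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical toy table: `zone` is fully determined by `city`,
# while `flag` is independent of `city`.
city = ["A", "A", "B", "B"]
zone = ["x", "x", "y", "y"]
flag = [0, 1, 0, 1]

# Opaque re-encoding of the same column (assumed mapping, for illustration).
opaque_codes = {"A": "q7", "B": "k2"}
city_opaque = [opaque_codes[v] for v in city]

mi_dependent = mutual_information(city, zone)
mi_independent = mutual_information(city, flag)
mi_opaque = mutual_information(city_opaque, zone)
```

Because the score depends only on how values co-occur, `mi_opaque` equals `mi_dependent`. That invariance is the property that lets dependency patterns between columns be compared across two schemas whose names and values are opaque.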