Mapping Patterns for Virtual Knowledge Graphs

Diego Calvanese, Avigdor Gal, Davide Lanti, Marco Montali, Alessandro Mosca, Roee Shraga
2023-08-11
Abstract: Virtual Knowledge Graphs (VKG) constitute one of the most promising paradigms for integrating and accessing legacy data sources. A critical bottleneck in the integration process involves the definition, validation, and maintenance of mappings that link data sources to a domain ontology. To support the management of mappings throughout their entire lifecycle, we propose a comprehensive catalog of sophisticated mapping patterns that emerge when linking databases to ontologies. To do so, we build on well-established methodologies and patterns studied in data management, data analysis, and conceptual modeling. These are extended and refined through the analysis of concrete VKG benchmarks and real-world use cases, and considering the inherent impedance mismatch between data sources and ontologies. We validate our catalog on the considered VKG scenarios, showing that it covers the vast majority of patterns present therein.
Artificial Intelligence, Databases
What problem does this paper attempt to address?
The paper primarily aims to address the issues encountered in Virtual Knowledge Graphs (VKG) when integrating and accessing legacy data sources, particularly concerning the definition, validation, and maintenance of mapping assertions that link data sources to domain ontologies. Specifically, the core contributions of the paper are:
1. **Identification of Mapping Patterns**: The authors establish a comprehensive catalog of high-level mapping patterns that occur when connecting databases to ontologies. These patterns are based on existing data management, data analysis, and conceptual modeling methods, and are extended and refined through specific VKG benchmarks and real-world cases.
2. **Addressing Challenges in the Mapping Lifecycle**: Managing the entire lifecycle of mappings in VKGs is a labor-intensive and mostly manual task that requires highly specialized skills. The proposed approach aims to support ontology engineers and knowledge scientists in creating VKG mappings, utilizing all relevant information resources to achieve maximum utility.
3. **Organization of Patterns**: The paper categorizes the patterns into two types: database structure-driven patterns (considering the database schema and its explicit constraints) and data-driven patterns (also considering the implicit constraints derived from specific data configurations in the database).
4. **Application of Patterns**: The proposed patterns can be used in various scenarios, such as validating existing mappings, generating mappings and ontologies (when only the database is available), and even serving as a basis for reconstructing implicit or inaccessible conceptual models.
5. **Evaluation of Patterns**: The paper also evaluates six different VKG scenarios, analyzing the mapping coverage and the reuse of each pattern in these scenarios.
In summary, this paper aims to simplify and optimize the creation process of VKG mappings by providing a detailed set of mapping patterns, thereby lowering the technical barriers to implementing complex enterprise-level VKG solutions.
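To make the idea of a structure-driven mapping concrete, the following is a minimal Python sketch of one plausible case: a table with a primary key is mapped to an ontology class, and its remaining columns are mapped to data properties. It is an illustration under stated assumptions, not a pattern taken from the paper's catalog. The function `apply_entity_mapping`, the `persons` rows, and the `:Person`, `:name`, and `:person/{}` identifiers are all hypothetical. The sketch also materializes triples for readability, whereas a VKG system would typically keep the graph virtual and answer queries by rewriting them over the database.

```python
from typing import Any, Dict, List, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"


def apply_entity_mapping(
    rows: List[Dict[str, Any]],
    key: str,
    class_iri: str,
    iri_template: str,
    attribute_properties: Dict[str, str],
) -> List[Triple]:
    """Hypothetical helper: map each row of a keyed table to RDF-style triples.

    The primary-key value is used to build the subject IRI, the row is typed
    with the given class, and each mapped non-NULL column yields one
    data-property triple.
    """
    triples: List[Triple] = []
    for row in rows:
        subject = iri_template.format(row[key])
        triples.append((subject, RDF_TYPE, class_iri))
        for column, prop in attribute_properties.items():
            value = row.get(column)
            if value is not None:  # a NULL value produces no triple
                triples.append((subject, prop, str(value)))
    return triples


# Assumed example data: a "persons" table with primary key "id".
persons = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": None},
]

triples = apply_entity_mapping(
    persons,
    key="id",
    class_iri=":Person",
    iri_template=":person/{}",
    attribute_properties={"name": ":name"},
)

for triple in triples:
    print(triple)
```

In this sketch only the schema (the key and the column names) determines the mapping. A data-driven pattern, by contrast, would also inspect the stored values, for example to detect an implicit constraint that the schema does not declare.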