KIF: A Wikidata-Based Framework for Integrating Heterogeneous Knowledge Sources

Guilherme Lima, João M. B. Rodrigues, Marcelo Machado, Elton Soares, Sandro R. Fiorini, Raphael Thiago, Leonardo G. Azevedo, Viviane T. da Silva, Renato Cerqueira
2024-07-24
Abstract: We present a Wikidata-based framework, called KIF, for virtually integrating heterogeneous knowledge sources. KIF is written in Python and is released as open source. It leverages Wikidata's data model and vocabulary, plus user-defined mappings, to construct a unified view of the underlying sources while keeping track of the context and provenance of their statements. The underlying sources can be triplestores, relational databases, CSV files, etc., which may or may not use the vocabulary and RDF encoding of Wikidata. The end result is a virtual knowledge base that behaves like an "extended Wikidata" and that can be queried using a simple but expressive pattern language, defined in terms of Wikidata's data model. In this paper, we present the design and implementation of KIF, discuss how we have used it to solve a real integration problem in the domain of chemistry (involving Wikidata, PubChem, and IBM CIRCA), and present experimental results on the performance and overhead of KIF.
Artificial Intelligence, Databases
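The idea of virtual integration summarized in the abstract can be illustrated with a small, self-contained sketch. This is not KIF's API: the names `Statement`, `VirtualStore`, `add_source`, and `filter`, the two toy sources, and their mapping functions are all hypothetical and exist only to show the general shape of the approach, namely per-source mappings into a common statement model, provenance attached to each statement, and pattern-based querying over the combined view without copying the data.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional

# Hypothetical common statement model (loosely inspired by subject/property/value
# statements); this is an illustration, not KIF's actual data model.
@dataclass(frozen=True)
class Statement:
    subject: str
    property: str
    value: str
    source: str  # provenance: which underlying source produced the statement

# A mapping turns one raw record into zero or more (subject, property, value) triples.
Mapping = Callable[[dict], Iterable[tuple[str, str, str]]]

class VirtualStore:
    """Toy virtual knowledge base: records are mapped lazily at query time."""

    def __init__(self) -> None:
        self._sources: list[tuple[str, list[dict], Mapping]] = []

    def add_source(self, name: str, records: list[dict], mapping: Mapping) -> None:
        self._sources.append((name, records, mapping))

    def filter(
        self,
        subject: Optional[str] = None,
        property: Optional[str] = None,
        value: Optional[str] = None,
    ) -> Iterator[Statement]:
        # None acts as a wildcard for the corresponding position of the pattern.
        for name, records, mapping in self._sources:
            for record in records:
                for s, p, v in mapping(record):
                    if subject is not None and s != subject:
                        continue
                    if property is not None and p != property:
                        continue
                    if value is not None and v != value:
                        continue
                    yield Statement(s, p, v, source=name)

# Two made-up sources with different shapes (assumed data, for illustration only).
triple_like = [{"item": "benzene", "prop": "chemical formula", "val": "C6H6"}]
csv_like = [{"name": "benzene", "mol_weight": "78.11"}]

def map_triple_like(rec: dict) -> Iterable[tuple[str, str, str]]:
    yield (rec["item"], rec["prop"], rec["val"])

def map_csv_like(rec: dict) -> Iterable[tuple[str, str, str]]:
    # User-defined mapping: rename the CSV column to a shared property name.
    yield (rec["name"], "mass", rec["mol_weight"])

store = VirtualStore()
store.add_source("source-a", triple_like, map_triple_like)
store.add_source("source-b", csv_like, map_csv_like)

benzene_facts = list(store.filter(subject="benzene"))
for st in benzene_facts:
    print(st)
```

In this sketch, a query with `subject="benzene"` returns statements from both sources, each tagged with the source it came from, which mirrors in miniature the unified view with provenance that the abstract describes.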