KGTK: A Toolkit for Large Knowledge Graph Manipulation and Analysis

Filip Ilievski, Daniel Garijo, Hans Chalupsky, Naren Teja Divvala, Yixiang Yao, Craig Rogers, Rongpeng Li, Jun Liu, Amandeep Singh, Daniel Schwabe, Pedro Szekely
DOI: https://doi.org/10.48550/arXiv.2006.00088
2021-05-26
Abstract: Knowledge graphs (KGs) have become the preferred technology for representing, sharing and adding knowledge to modern AI applications. While KGs have become a mainstream technology, the RDF/SPARQL-centric toolset for operating with them at scale is heterogeneous, difficult to integrate and only covers a subset of the operations that are commonly needed in data science applications. In this paper we present KGTK, a data science-centric toolkit designed to represent, create, transform, enhance and analyze KGs. KGTK represents graphs in tables and leverages popular libraries developed for data science applications, enabling a wide audience of developers to easily construct knowledge graph pipelines for their applications. We illustrate the framework with real-world scenarios where we have used KGTK to integrate and manipulate large KGs, such as Wikidata, DBpedia and ConceptNet.
Artificial Intelligence, Databases