Construction of Knowledge Graphs: State and Challenges

Marvin Hofer, Daniel Obraczka, Alieh Saeedi, Hanna Köpcke, Erhard Rahm
2023-10-11
Abstract: With knowledge graphs (KGs) at the center of numerous applications such as recommender systems and question answering, the need for generalized pipelines to construct and continuously update such KGs is increasing. While the individual steps that are necessary to create KGs from unstructured (e.g. text) and structured data sources (e.g. databases) are mostly well-researched for their one-shot execution, their adoption for incremental KG updates and the interplay of the individual steps have hardly been investigated in a systematic manner so far. In this work, we first discuss the main graph models for KGs and introduce the major requirements for future KG construction pipelines. Next, we provide an overview of the necessary steps to build high-quality KGs, including cross-cutting topics such as metadata management, ontology development, and quality assurance. We then evaluate the state of the art of KG construction w.r.t. the introduced requirements for specific popular KGs as well as some recent tools and strategies for KG construction. Finally, we identify areas in need of further research and improvement.
Artificial Intelligence, Databases, Machine Learning
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper examines the current state and challenges of knowledge graph (KG) construction. Specifically, it focuses on the following aspects:

1. **Incremental update**: Existing KG construction pipelines usually run in batch mode, so new information is difficult to incorporate without recomputing the whole graph. The paper explores how to design pipelines that update KGs continuously as the underlying information changes.
2. **Automation and manual intervention**: Many construction steps still require human intervention, which limits the ability to process large-scale data and slows updates. The paper discusses how to achieve a higher degree of automation and reduce the need for manual intervention.
3. **Data quality and integrity**: KG quality depends on multiple dimensions, such as correctness, timeliness, comprehensiveness, and conciseness. The paper sets out requirements for constructing high-quality KGs and evaluates how well existing methods meet them.
4. **Interdisciplinary requirements**: KG construction involves multiple research fields, such as natural language processing (NLP), data integration, knowledge representation, and knowledge management. The paper emphasizes the importance of expertise from these fields for constructing efficient knowledge graphs.
5. **Tools and methods**: The paper outlines the steps required to construct high-quality KGs, evaluates existing tools and methods, and points out the shortcomings of current methods and directions for future research.
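
The contrast between batch recomputation and incremental update described in the first item can be illustrated with a minimal Python sketch. It is not taken from the paper: the `KnowledgeGraph` class, the `apply_delta` method, and the per-triple source bookkeeping are hypothetical names chosen for illustration, and a real pipeline would also need entity resolution, ontology handling, and quality checks.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

# Assumption for illustration: a triple is (subject, predicate, object).
Triple = Tuple[str, str, str]


@dataclass
class KnowledgeGraph:
    """Toy in-memory KG: a set of triples plus the sources supporting each one."""
    triples: Set[Triple] = field(default_factory=set)
    sources: Dict[Triple, Set[str]] = field(default_factory=dict)

    def apply_delta(self, source: str, added: Set[Triple], removed: Set[Triple]) -> None:
        """Integrate one source's changes without rebuilding the whole graph."""
        for t in added:
            self.triples.add(t)
            self.sources.setdefault(t, set()).add(source)
        for t in removed:
            supporting = self.sources.get(t, set())
            supporting.discard(source)
            # Drop the triple only when no remaining source supports it.
            if not supporting:
                self.triples.discard(t)
                self.sources.pop(t, None)


kg = KnowledgeGraph()
fact = ("Leipzig", "locatedIn", "Germany")
kg.apply_delta("db_a", added={fact}, removed=set())
kg.apply_delta("text_b", added={fact}, removed=set())
# db_a retracts the fact, but text_b still supports it, so it stays.
kg.apply_delta("db_a", added=set(), removed={fact})
print(fact in kg.triples)
```

Tracking which sources support each triple is one simple way to let a source retract a statement without touching facts that other sources still assert; it is a design choice for this sketch, not a method described in the summary.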
### Main contributions

- **Define requirements**: The paper defines the main requirements for constructing and maintaining KGs, which serve as a guide for evaluating existing solutions and identifying open challenges.
- **Compare existing methods**: The paper selects 23 KG-specific construction approaches and general toolsets, and evaluates and compares them against these requirements.
- **Identify open challenges**: The paper identifies areas of KG construction that still need further research and improvement, providing directions for future work.

### Conclusion

By systematically analyzing the current state and challenges of KG construction, the paper offers guidance for researchers, engineers, and domain experts, helping them understand and address the complex problems involved in building knowledge graphs.