A Survey on State-of-the-art Techniques for Knowledge Graphs Construction and Challenges ahead

Ali Hur, Naeem Janjua, Mohiuddin Ahmed
DOI: https://doi.org/10.48550/arXiv.2110.08012
2021-12-31
Abstract: Global datasphere is increasing fast, and it is expected to reach 175 Zettabytes by 2025. However, most of the content is unstructured and is not understandable by machines. Structuring this data into a knowledge graph enables multitudes of intelligent applications such as deep question answering, recommendation systems, semantic search, etc. The knowledge graph is an emerging technology that allows logical reasoning and uncovers new insights using content along with the context. Thereby, it provides necessary syntax and reasoning semantics that enable machines to solve complex healthcare, security, financial institutions, economics, and business problems. As an outcome, enterprises are putting their effort into constructing and maintaining knowledge graphs to support various downstream applications. Manual approaches are too expensive. Automated schemes can reduce the cost of building knowledge graphs up to 15-250 times. This paper critiques state-of-the-art automated techniques to produce knowledge graphs of near-human quality autonomously. Additionally, it highlights different research issues that need to be addressed to deliver high-quality knowledge graphs.
Artificial Intelligence, Databases
What problem does this paper attempt to address?
This paper addresses how to construct a high-quality knowledge graph, with a focus on the challenges that arise when construction is automated. It concentrates on two aspects:

1. **Completeness of the knowledge graph**: how to increase coverage so that all relevant entities and relationships are included. This involves discovering hidden facts and filling in missing information, for example by using link prediction to identify different types of links, including general links, identity links, and type links.
2. **Correctness of the knowledge graph**: how to improve the accuracy of the information and eliminate errors and contradictions. This includes fact verification and inconsistency repair. Fact verification evaluates whether the statements in the knowledge graph are semantically correct and consistent with the real world; inconsistency repair ensures that the information follows the axioms or constraints defined in the ontology.

The paper reviews state-of-the-art automated techniques that aim to generate knowledge graphs autonomously at near-human quality, including rule mining, pattern mining, statistical and probabilistic methods, and embedding and neural network methods. It discusses their applications and limitations in improving knowledge graph quality, and points out open problems that need further research.
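To make the correctness aspect more concrete, the following is a minimal sketch of detecting inconsistencies between a set of triples and an ontology. It is not the paper's method. The toy schema, the entity names, and the `find_violations` helper are all hypothetical, and the sketch treats domain, range, and functional-property axioms as closed-world constraints, which is a simplification of how ontology languages usually interpret them.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical toy ontology: one domain and one range class per predicate,
# plus a flag marking predicates that allow at most one object per subject.
SCHEMA = {
    "born_in": {"domain": "Person", "range": "City", "functional": True},
    "works_for": {"domain": "Person", "range": "Organization", "functional": False},
}

# Hypothetical type assertions for the entities in the graph.
ENTITY_TYPES = {
    "alice": "Person",
    "perth": "City",
    "sydney": "City",
    "acme": "Organization",
}


def find_violations(triples, entity_types, schema):
    """Return (triple, reason) pairs for triples that break the schema."""
    violations = []
    first_object = {}  # (subject, predicate) -> first object seen
    for t in triples:
        rule = schema.get(t.predicate)
        if rule is None:
            violations.append((t, "unknown predicate"))
            continue
        if entity_types.get(t.subject) != rule["domain"]:
            violations.append((t, "domain violation"))
        if entity_types.get(t.obj) != rule["range"]:
            violations.append((t, "range violation"))
        if rule["functional"]:
            previous = first_object.setdefault((t.subject, t.predicate), t.obj)
            if previous != t.obj:
                violations.append((t, "functional violation"))
    return violations


TRIPLES = [
    Triple("alice", "born_in", "perth"),
    Triple("alice", "born_in", "sydney"),   # second birthplace
    Triple("alice", "works_for", "perth"),  # object is a City
    Triple("acme", "works_for", "acme"),    # subject is an Organization
]

VIOLATIONS = find_violations(TRIPLES, ENTITY_TYPES, SCHEMA)
for triple, reason in VIOLATIONS:
    print(reason, triple)
```

The sketch only flags the offending triples. Deciding which triple to delete or correct is the repair step, and it would need additional evidence such as source confidence or the output of fact verification.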