Abstract: Most available data is unstructured, making it challenging to access valuable information. Automatically building Knowledge Graphs (KGs) is crucial for structuring data and making it accessible, allowing users to search for information effectively. KGs also facilitate insights, inference, and reasoning. Traditional NLP methods, such as named entity recognition and relation extraction, are key in information retrieval but face limitations, including the use of predefined entity types and the need for supervised learning. Current research leverages large language models' capabilities, such as zero- or few-shot learning. However, unresolved and semantically duplicated entities and relations still pose challenges, leading to inconsistent graphs and requiring extensive post-processing. Additionally, most approaches are topic-dependent. In this paper, we propose iText2KG, a method for incremental, topic-independent KG construction without post-processing. This plug-and-play, zero-shot method is applicable across a wide range of KG construction scenarios and comprises four modules: Document Distiller, Incremental Entity Extractor, Incremental Relation Extractor, and Graph Integrator and Visualization. Our method demonstrates superior performance compared to baseline methods across three scenarios: converting scientific papers to graphs, websites to graphs, and CVs to graphs.

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

The paper addresses the problem of automatically constructing knowledge graphs (KGs) from unstructured text data. It identifies two groups of challenges and proposes a method to overcome them:

1. **Limitations of Traditional Methods**
   - Traditional natural language processing (NLP) methods such as named entity recognition (NER) and relation extraction (RE) are restricted to predefined entity and relation types, and they typically rely on supervised learning, which requires a large amount of manual annotation.
2. **Issues with Current Methods**
   - Current methods based on large language models (LLMs) perform well in zero-shot or few-shot settings but still leave entities and relations unresolved or semantically duplicated, which leads to inconsistent graphs and requires extensive post-processing.
   - Many of these methods are also topic-specific and cannot be applied widely across domains.
3. **Proposed New Method**
   - The paper proposes iText2KG, a method for incrementally constructing consistent knowledge graphs from raw documents without post-processing steps. It is a plug-and-play, zero-shot approach suitable for a wide range of KG construction scenarios.
   - iText2KG consists of four modules:
     - **Document Distiller**: Rewrites raw documents into semantic blocks.
     - **Incremental Entity Extractor**: Extracts and parses entities from the semantic blocks.
     - **Incremental Relation Extractor**: Detects relationships within the semantic blocks.
     - **Graph Integrator and Visualization**: Uses Neo4j to visualize these entities and relationships in graph format.

In this way, iText2KG aims to improve the efficiency and accuracy of KG construction, reduce redundant information, and ensure the uniqueness and consistency of entities and relationships.
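The incremental idea described above, where each new entity is checked against those already in the graph before it is added, can be illustrated with a minimal sketch. This is not the authors' implementation. The summary does not say how duplicates are matched, so the sketch assumes a plain string-similarity comparison from Python's standard library and an arbitrary threshold of 0.8. The names `similarity`, `resolve`, and `build_graph` are hypothetical, and the LLM extraction step is replaced by hand-written triples.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1].

    Assumption: a stand-in for whatever semantic matcher a real system uses.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve(name: str, known: list[str], threshold: float = 0.8) -> str:
    """Return the closest already-known entity, or register `name` as new."""
    best = max(known, key=lambda k: similarity(name, k), default=None)
    if best is not None and similarity(name, best) >= threshold:
        return best  # reuse the existing node instead of creating a duplicate
    known.append(name)
    return name


def build_graph(blocks):
    """Process semantic blocks one at a time, resolving entities as they arrive."""
    entities: list[str] = []
    relations: set[tuple[str, str, str]] = set()
    for block in blocks:
        for head, rel, tail in block:
            # Both endpoints are mapped to known entities before the edge is stored.
            relations.add((resolve(head, entities), rel, resolve(tail, entities)))
    return entities, relations


# Hypothetical triples standing in for the output of an LLM-based extractor.
blocks = [
    [("iText2KG", "uses", "Neo4j")],
    [("itext2kg", "has_module", "Document Distiller"), ("IText2KG", "uses", "neo4j")],
]
entities, relations = build_graph(blocks)
print(entities)
print(sorted(relations))
```

Because every entity is resolved at insertion time, the spelling variants in the second block map onto nodes created by the first block, and the repeated `uses` edge collapses into one. This is the sense in which no separate post-processing pass is needed in this toy setting.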

iText2KG: Incremental Knowledge Graphs Construction Using Large Language Models
