Abstract: Open Information Extraction (OIE) aims to extract objective structured knowledge from natural texts, and building dedicated models that encode human experience for this task has attracted growing attention. As large language models (LLMs) have exhibited remarkable in-context learning capabilities, a question arises as to whether the task of OIE can be effectively tackled with this paradigm. In this paper, we explore solving the OIE problem by constructing an appropriate reasoning environment for LLMs. Specifically, we first propose a method to effectively estimate the discrepancy of syntactic distribution between an LLM and test samples, which can serve as correlation evidence for preparing positive demonstrations. Based on this evidence, we introduce a simple yet effective mechanism to establish the reasoning environment for LLMs on specific tasks. Without bells and whistles, experimental results on the standard CaRB benchmark demonstrate that our $6$-shot approach outperforms state-of-the-art supervised methods, achieving a $55.3$ $F_1$ score. Further experiments on TACRED and ACE05 show that our method can naturally generalize to other information extraction tasks, resulting in improvements of $5.7$ and $6.8$ $F_1$ scores, respectively.

What problem does this paper attempt to address?

### The Problem the Paper Attempts to Solve

The paper asks how large language models (LLMs) can be used effectively for open information extraction (OIE). Specifically, the authors explore improving the performance of LLMs on OIE by constructing an appropriate reasoning environment. The core question is: can OIE performance be significantly improved by constructing a consistent reasoning environment that reduces the syntactic distribution differences between test samples and LLMs?

### Background and Motivation

Open Information Extraction (OIE) aims to extract objective structured knowledge from natural text, a task that has attracted increasing attention from researchers. Traditional OIE methods typically rely on human expertise encoded in statistical or rule-based models. However, as LLMs demonstrate powerful in-context learning capabilities, researchers are beginning to consider whether LLMs can be used effectively to solve OIE tasks.

### Methods and Techniques

1. **Estimating Syntactic Distribution Differences**:
   - The authors propose a method to estimate the syntactic distribution differences between black-box LLMs and test samples. They quantify this difference by calculating the syntactic distance between source sentences and their target sentences.
   - Hierarchical Weighted Syntax (HWS) distance is the metric used to calculate the syntactic difference for each sentence.
2. **Constructing a Consistent Reasoning Environment**:
   - Based on the estimated syntactic distribution differences, the authors propose a simple yet effective mechanism to construct a consistent reasoning environment.
   - This environment consists mostly of examples whose distribution is similar to that of the test samples, along with a few variant examples, to balance consistency and diversity.
3. **Experimental Validation**:
   - The authors conducted extensive experiments on the standard OIE benchmark CaRB, including 3-shot to 7-shot experiments and controlled-variable experiments.
   - The experimental results show that the proposed method significantly improves the few-shot performance of LLMs on OIE tasks, with the 6-shot results surpassing current supervised models and achieving an F1 score of 55.3.

### Experimental Results

- **OIE Task**:
  - On the CaRB benchmark, the authors' method surpassed the current fully supervised models in the 6-shot setting, achieving an F1 score of 55.3.
  - As the candidate example set grows, performance gradually improves, with the largest set of 4932 examples achieving the best F1 score.
- **Relation Extraction (RE) Task**:
  - On the TACRED benchmark, the authors' method improved ChatGPT's performance by approximately 6% in F1 score in the 30-shot setting.
  - Even compared to state-of-the-art supervised models, the authors' method narrowed the performance gap significantly (from 78% to 89%).
- **Event Extraction (EE) Task**:
  - On the ACE05 benchmark, the authors' method improved ChatGPT's performance by 7.4% in F1 score in the 50-shot setting.
  - These results confirm the effectiveness of the proposed method on the EE task.

### Conclusion

By constructing a consistent reasoning environment, the paper significantly improves the few-shot performance of LLMs on OIE, RE, and EE. The method offers new insights into using LLMs for NLP tasks and lays a foundation for future related research.
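The selection step described under Methods and Techniques (mostly similar demonstrations plus a few variants) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the summary does not define the HWS distance, so `toy_syntactic_distance` is a stand-in that compares tag sets level by level with a depth decay, and treating the "variant" examples as the farthest candidates is one possible reading, not the authors' procedure.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical representation: a sentence's syntax is a list of levels,
# each level holding the tags found at that depth of a parse tree.
Syntax = Sequence[Sequence[str]]
Candidate = Tuple[str, Syntax]  # (sentence text, its syntax levels)


def toy_syntactic_distance(a: Syntax, b: Syntax, decay: float = 0.5) -> float:
    """Stand-in distance (NOT the paper's HWS definition).

    For each depth level, take the fraction of tags that appear in only one
    of the two sentences, and weight shallower levels more heavily.
    """
    total = 0.0
    for level in range(max(len(a), len(b))):
        tags_a = set(a[level]) if level < len(a) else set()
        tags_b = set(b[level]) if level < len(b) else set()
        union = tags_a | tags_b
        mismatch = len(tags_a ^ tags_b) / len(union) if union else 0.0
        total += (decay ** level) * mismatch
    return total


def select_demonstrations(
    test_syntax: Syntax,
    candidates: List[Candidate],
    k: int = 6,
    n_variants: int = 1,
    distance: Callable[[Syntax, Syntax], float] = toy_syntactic_distance,
) -> List[Candidate]:
    """Pick k demonstrations: the (k - n_variants) closest candidates,
    plus the n_variants farthest ones (an assumed notion of "variant")."""
    if not 0 <= n_variants <= k:
        raise ValueError("n_variants must be between 0 and k")
    if k > len(candidates):
        raise ValueError("not enough candidates for k demonstrations")
    ranked = sorted(candidates, key=lambda c: distance(test_syntax, c[1]))
    similar = ranked[: k - n_variants]
    variants = ranked[len(ranked) - n_variants:] if n_variants else []
    return similar + variants


if __name__ == "__main__":
    test = [["S"], ["NP", "VP"], ["DT", "NN", "VBD"]]
    pool = [
        ("a", [["S"], ["NP", "VP"], ["DT", "NN", "VBD"]]),
        ("b", [["S"], ["NP", "VP"], ["NNP", "VBD"]]),
        ("c", [["S"], ["VP"], ["VB"]]),
        ("d", [["FRAG"], ["NP"]]),
    ]
    for sentence, _ in select_demonstrations(test, pool, k=3, n_variants=1):
        print(sentence)
```

The `distance` parameter is left pluggable so that a faithful HWS implementation, once its definition is taken from the paper, could replace the toy metric without changing the selection logic.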

Mastering the Task of Open Information Extraction with Large Language Models and Consistent Reasoning Environment

Concise and Organized Perception Facilitates Large Language Models for Deductive Reasoning.

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Assessing the Performance of Chinese Open Source Large Language Models in Information Extraction Tasks

Evidence-based Interpretable Open-domain Fact-checking with Large Language Models

Information Re-Organization Improves Reasoning in Large Language Models

A Survey on Open Information Extraction from Rule-based Model to Large Language Model

Self-prompted Chain-of-Thought on Large Language Models for Open-domain Multi-hop Reasoning

ADELIE: Aligning Large Language Models on Information Extraction

Concise and Organized Perception Facilitates Reasoning in Large Language Models

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

CLR-Fact: Evaluating the Complex Logical Reasoning Capability of Large Language Models over Factual Knowledge

LINKED: Eliciting, Filtering and Integrating Knowledge in Large Language Model for Commonsense Reasoning

Efficient Data Learning for Open Information Extraction with Pre-trained Language Models

IELM: an Open Information Extraction Benchmark for Pre-Trained Language Models

Benchmarking Large Language Models with Augmented Instructions for Fine-grained Information Extraction

MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models

Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models

Exchange-of-Thought: Enhancing Large Language Model Capabilities through Cross-Model Communication

REL: Working out is all you need