Mastering the Task of Open Information Extraction with Large Language Models and Consistent Reasoning Environment

Ji Qi, Kaixuan Ji, Xiaozhi Wang, Jifan Yu, Kaisheng Zeng, Lei Hou, Juanzi Li, Bin Xu
2023-10-17
Abstract: Open Information Extraction (OIE) aims to extract objective structured knowledge from natural texts, which has attracted growing attention to build dedicated models with human experience. As large language models (LLMs) have exhibited remarkable in-context learning capabilities, a question arises as to whether the task of OIE can be effectively tackled with this paradigm. In this paper, we explore solving the OIE problem by constructing an appropriate reasoning environment for LLMs. Specifically, we first propose a method to effectively estimate the discrepancy of syntactic distribution between an LLM and test samples, which can serve as correlation evidence for preparing positive demonstrations. Upon the evidence, we introduce a simple yet effective mechanism to establish the reasoning environment for LLMs on specific tasks. Without bells and whistles, experimental results on the standard CaRB benchmark demonstrate that our $6$-shot approach outperforms the state-of-the-art supervised method, achieving a $55.3$ $F_1$ score. Further experiments on TACRED and ACE05 show that our method can naturally generalize to other information extraction tasks, resulting in improvements of $5.7$ and $6.8$ $F_1$ scores, respectively.
Computation and Language
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

The paper asks how large language models (LLMs) can be used effectively for open information extraction (OIE). Specifically, the authors explore improving LLM performance on OIE by constructing an appropriate reasoning environment. The core question is: can OIE performance be significantly improved by constructing a consistent reasoning environment that reduces the difference in syntactic distribution between the test samples and the LLM?

### Background and Motivation

Open Information Extraction (OIE) aims to extract objective structured knowledge from natural text, a task that has attracted increasing attention from researchers. Traditional OIE methods typically rely on human expertise and on statistical or rule-based models. However, as LLMs demonstrate strong in-context learning capabilities, researchers have begun to consider whether LLMs can be used effectively to solve OIE tasks.

### Methods and Techniques

1. **Estimating Syntactic Distribution Differences**:
   - The authors propose a method to estimate the difference in syntactic distribution between a black-box LLM and the test samples. They quantify this difference by calculating the syntactic distance between source sentences and their target sentences.
   - Hierarchical Weighted Syntax (HWS) distance is used as the metric for the syntactic difference of each sentence.
2. **Constructing a Consistent Reasoning Environment**:
   - Based on the estimated syntactic distribution differences, the authors propose a simple yet effective mechanism to construct a consistent reasoning environment.
   - This environment consists mostly of examples whose distribution is similar to that of the test samples, along with a few variant examples, to ensure both consistency and diversity.
3. **Experimental Validation**:
   - The authors conducted extensive experiments on the standard OIE benchmark CaRB, including 3- to 7-shot experiments and controlled-variable experiments.
   - The results show that the proposed method significantly improves the few-shot performance of LLMs on OIE, with the 6-shot results surpassing current supervised models at an F1 score of 55.3.

### Experimental Results

- **OIE Task**:
  - On the CaRB benchmark, the authors' method surpassed the current fully supervised models in the 6-shot setting, achieving an F1 score of 55.3.
  - As the candidate example set grows, performance gradually improves, with the largest set of 4932 examples achieving the best F1 score.
- **Relation Extraction (RE) Task**:
  - On the TACRED benchmark, the authors' method improved ChatGPT's performance by approximately 6% in F1 score in the 30-shot setting.
  - Even compared to state-of-the-art supervised models, the authors' method narrowed the performance gap significantly (from 78% to 89%).
- **Event Extraction (EE) Task**:
  - On the ACE05 benchmark, the authors' method improved ChatGPT's performance by 7.4% in F1 score in the 50-shot setting.
  - These results confirm the effectiveness of the proposed method on the EE task.

### Conclusion

By constructing a consistent reasoning environment, the paper significantly improves the few-shot performance of LLMs on OIE, RE, and EE. The method offers new insights into using LLMs for NLP tasks and lays a foundation for future related research.
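
The demonstration-selection idea described under Methods and Techniques (mostly syntactically similar examples plus a few variant ones) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the per-depth label-set representation of a sentence, the function names `toy_syntactic_distance` and `select_demonstrations`, the `decay` weighting, and the choice of the farthest candidates as "variant" examples. In particular, the distance is a simple depth-weighted Jaccard stand-in and is not the paper's HWS distance, whose exact definition is not given in this summary.

```python
from typing import List, Sequence, Set

# Assumed representation: one set of syntactic labels per tree depth,
# e.g. [{"S"}, {"NP", "VP"}]. This is illustrative, not the paper's format.
Levels = Sequence[Set[str]]


def toy_syntactic_distance(a: Levels, b: Levels, decay: float = 0.5) -> float:
    """Depth-weighted Jaccard distance in [0, 1].

    A stand-in for a syntactic distance; NOT the HWS distance from the paper.
    """
    depth = max(len(a), len(b))
    if depth == 0:
        return 0.0
    total, norm = 0.0, 0.0
    for d in range(depth):
        labels_a = a[d] if d < len(a) else set()
        labels_b = b[d] if d < len(b) else set()
        union = labels_a | labels_b
        level_dist = 1.0 - len(labels_a & labels_b) / len(union) if union else 0.0
        weight = decay ** d  # shallower levels contribute more
        total += weight * level_dist
        norm += weight
    return total / norm


def select_demonstrations(
    test: Levels, candidates: List[Levels], k: int = 6, n_variant: int = 1
) -> List[int]:
    """Return indices of k demonstrations for one test sentence.

    Picks the (k - n_variant) closest candidates, then adds the n_variant
    farthest ones as "variant" examples (an assumed reading of "variant").
    """
    if not 0 <= n_variant < k or k > len(candidates):
        raise ValueError("need 0 <= n_variant < k <= len(candidates)")
    # Sort candidate indices from most to least similar to the test sentence.
    order = sorted(
        range(len(candidates)),
        key=lambda i: toy_syntactic_distance(test, candidates[i]),
    )
    similar = order[: k - n_variant]
    variants = order[len(order) - n_variant:] if n_variant else []
    return similar + variants
```

The returned indices would then be used to pick the few-shot examples placed in the prompt ahead of the test sentence. Taking the variant examples from the far end of the ranking is only one possible design; sampling them at random from the remaining candidates would be an equally plausible reading of the summary.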