ChatIE: Zero-Shot Information Extraction via Chatting with ChatGPT

Xiang Wei,Xingyu Cui,Ning Cheng,Xiaobin Wang,Xin Zhang,Shen Huang,Pengjun Xie,Jinan Xu,Yufeng Chen,Meishan Zhang,Yong Jiang,Wenjuan Han
2024-05-27
Abstract:Zero-shot information extraction (IE) aims to build IE systems from the unannotated text. It is challenging due to involving little human intervention. Challenging but worthwhile, zero-shot IE reduces the time and effort that data labeling takes. Recent efforts on large language models (LLMs, e.g., GPT-3, ChatGPT) show promising performance on zero-shot settings, thus inspiring us to explore prompt-based methods. In this work, we ask whether strong IE models can be constructed by directly prompting LLMs. Specifically, we transform the zero-shot IE task into a multi-turn question-answering problem with a two-stage framework (ChatIE). With the power of ChatGPT, we extensively evaluate our framework on three IE tasks: entity-relation triple extract, named entity recognition, and event extraction. Empirical results on six datasets across two languages show that ChatIE achieves impressive performance and even surpasses some full-shot models on several datasets (e.g., NYT11-HRL). We believe that our work could shed light on building IE models with limited resources.
Computation and Language
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve This paper aims to address the problem of Zero-Shot Information Extraction (IE). Specifically, the goal of zero-shot information extraction is to build an information extraction system from unannotated text, which is a challenging task because it involves minimal human intervention. However, zero-shot information extraction also has significant practical importance as it can greatly reduce the time and effort required for data annotation. ### Background and Motivation In recent years, large-scale language models (such as GPT-3 and ChatGPT) have shown outstanding performance in zero-shot settings, which has inspired researchers to explore prompt-based methods to build powerful information extraction models. The paper proposes a new method called ChatIE, which transforms the zero-shot information extraction task into a multi-turn question-answering problem and leverages the powerful capabilities of ChatGPT to achieve efficient information extraction. ### Method Overview ChatIE adopts a two-stage framework: 1. **First Stage**: Determine the types of elements that may exist in the sentence (such as entities, relationships, or events). Through a single question-answering round, filter out non-existent element types to reduce the search space and computational complexity. 2. **Second Stage**: Based on the element types extracted in the first stage, further extract relevant information. Each stage is implemented through multiple rounds of question-answering, with each round's prompts constructed based on designed templates and previously extracted information, ultimately combining the information extracted in each round into the final structured data. ### Experiments and Results The paper conducts extensive experiments on 6 datasets, including tasks such as relation extraction (RE), named entity recognition (NER), and event extraction (EE). The experimental results show that ChatIE achieves impressive performance on multiple datasets, even surpassing full-sample models on some datasets (e.g., NYT11-HRL). Additionally, ChatIE significantly outperforms other zero-shot and few-shot methods in zero-shot settings. ### Main Contributions 1. **Innovation**: For the first time, quantitatively explores the possibility of building zero-shot information extraction models by directly prompting large-scale language models. 2. **Effectiveness**: ChatIE demonstrates excellent performance across multiple tasks and datasets, showcasing its potential for building information extraction models in resource-limited scenarios. 3. **Generality**: The ChatIE framework is not only applicable to ChatGPT but can also be applied to other large-scale language models, exhibiting strong generality and extensibility. ### Conclusion The paper demonstrates how to efficiently perform information extraction in zero-shot settings using large-scale language models through the ChatIE framework. This method not only performs well across multiple tasks and datasets but also provides new ideas and directions for future research.