A Survey on Open Information Extraction from Rule-based Model to Large Language Model (meta)

Pai Liu, Wenyang Gao, Wenjie Dong, Lin Ai, Ziwei Gong, Songfang Huang, Zongsheng Li, Ehsan Hoque, Julia Hirschberg, Yue Zhang

2024-04-26

Abstract: Open Information Extraction (OpenIE) represents a crucial NLP task aimed at deriving structured information from unstructured text, unrestricted by relation type or domain. This survey paper provides an overview of OpenIE technologies spanning from 2007 to 2024, emphasizing a chronological perspective absent in prior surveys. It examines the evolution of task settings in OpenIE to align with the advances in recent technologies. The paper categorizes OpenIE approaches into rule-based, neural, and pre-trained large language models, discussing each within a chronological framework. Additionally, it highlights prevalent datasets and evaluation metrics currently in use. Building on this extensive review, the paper outlines potential future directions in terms of datasets, information sources, output formats, methodologies, and evaluation metrics.

Computation and Language

What problem does this paper attempt to address?

The paper aims to provide a comprehensive review of how Open Information Extraction (OpenIE) technology has developed and been applied. Specifically, it sets out to:

1. **Provide an overview of OpenIE technology from a chronological perspective**: Covering 2007 to 2024, the paper reviews the development of OpenIE in detail and highlights how technological advances have reshaped task settings.
2. **Classify and discuss OpenIE methods**: The paper groups OpenIE methods into rule-based methods, neural methods, and methods based on pre-trained large language models (LLMs), and discusses each category in chronological order.
3. **Survey existing datasets and evaluation metrics**: The paper outlines the OpenIE datasets and evaluation metrics currently in common use, as a reference for researchers.
4. **Look ahead to future directions**: Building on its review of existing technologies and methods, the paper proposes potential future directions in terms of datasets, information sources, output formats, methodologies, and evaluation metrics.

In doing so, the paper supplies the chronological perspective that prior surveys lack and gives researchers a roadmap for further work on OpenIE.

A Survey on Neural Open Information Extraction: Current Status and Future Directions

Assessing the Performance of Chinese Open Source Large Language Models in Information Extraction Tasks

Large Language Models for Generative Information Extraction: A Survey

Rules still work for Open Information Extraction

OpenUE: an Open Toolkit of Universal Extraction from Text

milIE: Modular & Iterative Multilingual Open Information Extraction

Mastering the Task of Open Information Extraction with Large Language Models and Consistent Reasoning Environment

Efficient Data Learning for Open Information Extraction with Pre-trained Language Models

Transformer Based Network for Open Information Extraction

DocOIE: A Document-level Context-Aware Dataset for OpenIE

Open Information Extraction: A Review of Baseline Techniques, Approaches, and Applications

A Survey of Document-Level Information Extraction

Generative adversarial networks for open information extraction

Multi-Round Parsing-based Multiword Rules for Scientific OpenIE

A Meta Learning Approach for Open Information Extraction

An Empirical Study on Information Extraction using Large Language Models

Dave: Extracting Domain Attributes And Values From Text Corpus

Open information extraction from the web

IELM: an Open Information Extraction Benchmark for Pre-Trained Language Models