Adaptive Reinforcement Learning Planning: Harnessing Large Language Models for Complex Information Extraction

Zepeng Ding, Ruiyang Ke, Wenhao Huang, Guochao Jiang, Yanda Li, Deqing Yang, Jiaqing Liang
2024-08-29
Abstract: Existing research on large language models (LLMs) shows that they can solve information extraction tasks through multi-step planning. However, their extraction behavior on complex sentences and tasks is unstable, with issues such as false positives and missing elements emerging. We observe that decomposing complex extraction tasks and extracting them step by step can effectively improve LLMs' performance, and that the extraction order of entities significantly affects the final results of LLMs. This paper proposes a two-stage multi-step method for LLM-based information extraction and adopts a reinforcement learning (RL) framework to execute the multi-step planning. We regard sequential extraction as a Markov decision process, build an LLM-based extraction environment, design a decision module to adaptively provide the optimal order for sequential entity extraction on different sentences, and utilize the DDQN algorithm to train the decision model. We also design rewards and evaluation metrics suitable for the extraction results of LLMs. We conduct extensive experiments on multiple public datasets to demonstrate the effectiveness of our method in improving the information extraction capabilities of LLMs.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper aims to address the unstable performance of large language models (LLMs) on complex information extraction tasks, specifically problems such as false positives and missing elements. The authors observe that breaking a complex extraction task into multiple steps can significantly improve LLM performance, and that the order in which entities are extracted has a significant impact on the final results. The paper therefore proposes a two-stage multi-step method to improve LLM-based information extraction. The method uses a reinforcement learning framework to execute multi-step planning: it models sequential extraction as a Markov decision process and designs a decision module that adaptively provides the optimal entity extraction order for each sentence. The paper also designs a reward mechanism and evaluation metrics suited to assessing the extraction results of LLMs. Extensive experiments on multiple public datasets validate the effectiveness of the method. Overall, the core objective of the paper is to enhance the stability and effectiveness of LLMs in complex information extraction tasks through reinforcement learning without fine-tuning.
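To make the idea of "sequential extraction as a Markov decision process" more concrete, here is a minimal sketch under stated assumptions. The summary does not specify the paper's state representation, reward design, or network architecture, so everything below is hypothetical: `ExtractionEnv`, `extract_fn`, and `reward_fn` are illustrative names, the state is simply the tuple of entity types extracted so far, and the LLM call is replaced by a stub. The only part that follows a known definition is `double_dqn_target`, which computes the standard Double DQN target: the online network selects the next action and the target network evaluates it.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[str, ...]  # entity types extracted so far, in order (assumed encoding)


class ExtractionEnv:
    """Toy stand-in for an LLM-based extraction environment (hypothetical)."""

    def __init__(
        self,
        entity_types: Sequence[str],
        extract_fn: Callable[[str, State], str],  # stub for the LLM extraction call
        reward_fn: Callable[[str, str], float],   # stub for the reward design
    ) -> None:
        self.entity_types = list(entity_types)
        self.extract_fn = extract_fn
        self.reward_fn = reward_fn
        self.extracted: State = ()

    def reset(self) -> State:
        self.extracted = ()
        return self.extracted

    def valid_actions(self) -> List[int]:
        # An action is the index of an entity type that has not been extracted yet.
        return [i for i, t in enumerate(self.entity_types) if t not in self.extracted]

    def step(self, action: int) -> Tuple[State, float, bool]:
        etype = self.entity_types[action]
        result = self.extract_fn(etype, self.extracted)
        reward = self.reward_fn(etype, result)
        self.extracted = self.extracted + (etype,)
        done = len(self.extracted) == len(self.entity_types)
        return self.extracted, reward, done


def double_dqn_target(
    reward: float,
    done: bool,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    valid_next: Sequence[int],
    gamma: float = 0.99,
) -> float:
    """Double DQN target: online net picks the action, target net scores it."""
    if done or not valid_next:
        return reward
    best = max(valid_next, key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


if __name__ == "__main__":
    env = ExtractionEnv(
        ["subject", "relation", "object"],
        extract_fn=lambda etype, state: f"{etype}-span",  # placeholder, no real LLM
        reward_fn=lambda etype, result: 1.0,              # placeholder reward
    )
    env.reset()
    state, reward, done = env.step(1)  # extract "relation" first
    print(state, reward, done, env.valid_actions())
    print(double_dqn_target(reward, done, [0.2, 0.9, 0.5], [0.3, 0.1, 0.8],
                            env.valid_actions(), gamma=0.5))
```

In a real setup the Q-values would come from a trained decision model conditioned on the sentence, and masking invalid actions (as `valid_next` does here) is one plausible way to keep the agent from extracting the same entity type twice; whether the paper does this is not stated in the summary.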