LLM with Relation Classifier for Document-Level Relation Extraction

Xingzuo Li, Kehai Chen, Yunfei Long, Min Zhang
2024-08-26
Abstract: Large language models (LLMs) create a new paradigm for natural language processing. Despite their advancement, LLM-based methods still lag behind traditional approaches in document-level relation extraction (DocRE), a critical task for understanding complex entity relations. This paper investigates the causes of this performance gap and identifies the dispersion of LLM attention caused by entity pairs without relations as a primary factor. We then introduce a novel classifier-LLM approach to DocRE. The proposed approach begins with a classifier specifically designed to select entity pair candidates exhibiting potential relations, and then feeds them to the LLM for the final relation extraction. This ensures that, during inference, the LLM's focus is directed primarily at entity pairs with relations. Experiments on DocRE benchmarks reveal that our method significantly outperforms recent LLM-based DocRE models and achieves competitive performance with several leading traditional DocRE models.
Computation and Language
What problem does this paper attempt to address?
The paper addresses the performance gap between large language models (LLMs) and traditional methods on the document-level relation extraction (DocRE) task. Specifically, it observes that constructing candidate entity pairs produces a large number of no-relation (NA) pairs, which skews the data distribution. This imbalance disperses the LLM's attention, so it fails to focus on the entity pairs that actually express a relation. To solve this problem, the paper proposes a method called LMRC, which first uses a classifier to select the entity pairs likely to express a relation and then uses an LLM for the final relation classification on those pairs only. Experimental results show that this method significantly outperforms other LLM-based methods on the DocRED and Re-DocRED datasets and is competitive with several leading traditional DocRE models.
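The two-stage idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 0.5 threshold, and the toy scorer and labeler (simple lookup tables standing in for the trained classifier and the LLM) are all assumptions made for illustration. It only shows the control flow: enumerate candidate pairs, keep those the classifier scores as likely related, and pass only the kept pairs to the relation labeler.

```python
from itertools import permutations
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (head entity, tail entity)


def candidate_pairs(entities: List[str]) -> List[Pair]:
    """All ordered (head, tail) pairs: n entities give n * (n - 1) candidates."""
    return list(permutations(entities, 2))


def extract_relations(
    document: str,
    entities: List[str],
    pair_scorer: Callable[[str, Pair], float],
    relation_labeler: Callable[[str, Pair], str],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> Dict[Pair, str]:
    # Stage 1: the classifier keeps only pairs likely to express a relation,
    # so NA pairs never reach the second stage.
    kept = [
        pair
        for pair in candidate_pairs(entities)
        if pair_scorer(document, pair) >= threshold
    ]
    # Stage 2: the labeler (an LLM in the paper) classifies the kept pairs only.
    return {pair: relation_labeler(document, pair) for pair in kept}


# Toy stand-ins (assumptions): lookup tables instead of trained models.
TOY_LABELS: Dict[Pair, str] = {
    ("Marie Curie", "Warsaw"): "place of birth",
    ("Marie Curie", "Paris"): "work location",
}


def toy_scorer(document: str, pair: Pair) -> float:
    """Pretend classifier: high score for pairs known to be related."""
    return 0.9 if pair in TOY_LABELS else 0.1


def toy_labeler(document: str, pair: Pair) -> str:
    """Pretend LLM: returns the relation label for a kept pair."""
    return TOY_LABELS[pair]


document = "Marie Curie was born in Warsaw. She later worked in Paris."
entities = ["Marie Curie", "Warsaw", "Paris"]

all_pairs = candidate_pairs(entities)
relations = extract_relations(document, entities, toy_scorer, toy_labeler)
na_count = len(all_pairs) - len(relations)
```

In this toy document, three entities yield six ordered candidate pairs, of which four are NA. Filtering them before the second stage is the step the summary credits with keeping the LLM focused on pairs that carry a relation.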