Distilling the Documents for Relation Extraction by Topic Segmentation

Minghui Wang, Ping Xue, Ying Li, Zhonghai Wu
DOI: https://doi.org/10.1007/978-3-030-86549-8_33
2021-01-01
Abstract: Sentence-level relation extraction (RE) struggles to identify relations that span multiple sentences, so researchers are moving to document-level RE. Because a document can be regarded as a long sequence of connected sentences, document-level RE obtains cross-sentence relations by reasoning over and aggregating information between entity mentions. However, document-level RE suffers from low efficiency: encoding a whole document is more expensive than encoding a single sentence. We also observe that, for the purpose of relation extraction, documents contain much irrelevant content, which not only wastes encoding time but also introduces potential noise. Based on this observation, we propose a novel framework to identify and discard such irrelevant content. It integrates topic segmentation to distill the document into topically coherent segments that carry concise and comprehensive information about the entities. To make further use of these segments, we add a topic enhancement module to the framework that enhances the predicted relations. Experimental results on two datasets indicate that our framework outperforms all the baseline models. Furthermore, our framework increases speed by 3.06 times and reduces memory consumption by 2.90 times while maximizing the F1 score.
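The distillation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the segmentation model, so the sketch swaps in a simple lexical-overlap heuristic (in the spirit of TextTiling) and a plain substring match for entity mentions. The function names `tokenize`, `jaccard`, `segment`, and `distill`, and the `threshold` parameter, are hypothetical and chosen only for illustration.

```python
import re


def tokenize(sentence):
    """Lowercase word tokens; a stand-in for a real tokenizer (assumption)."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Lexical overlap between two token sets, in [0, 1]."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def segment(sentences, threshold=0.1):
    """Start a new segment when adjacent sentences share little vocabulary.

    This heuristic only approximates topic segmentation; it is an
    illustrative substitute, not the paper's segmentation component.
    """
    if not sentences:
        return []
    segments = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if jaccard(tokenize(prev), tokenize(cur)) < threshold:
            segments.append([cur])
        else:
            segments[-1].append(cur)
    return segments


def distill(sentences, entities, threshold=0.1):
    """Keep only the segments that mention at least one target entity."""
    wanted = [e.lower() for e in entities]
    kept = []
    for seg in segment(sentences, threshold):
        text = " ".join(seg).lower()
        if any(e in text for e in wanted):
            kept.append(seg)
    return kept


doc = [
    "Marie Curie was born in Warsaw.",
    "Marie Curie later moved to Paris.",
    "The weather today is sunny.",
    "Sunny weather is expected tomorrow.",
]
# Only the segment that mentions the entities would be passed to the encoder.
distilled = distill(doc, ["Marie Curie", "Paris"])
```

In this toy document, the two weather sentences form their own segment and are dropped because they mention neither entity, so a downstream encoder would see half as many sentences. A real system would replace both the overlap heuristic and the substring match with learned components.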