Denoising Relation Extraction from Document-level Distant Supervision

Chaojun Xiao, Yuan Yao, Ruobing Xie, Xu Han, Zhiyuan Liu, Maosong Sun, Fen Lin, Leyu Lin
DOI: https://doi.org/10.18653/v1/2020.emnlp-main.300
2020-01-01
Abstract: Distant supervision (DS) has been widely used to generate auto-labeled data for sentence-level relation extraction (RE), which improves RE performance. However, the existing success of DS cannot be directly transferred to the more challenging document-level relation extraction (DocRE), since the inherent noise in DS may be multiplied at the document level and significantly harm RE performance. To address this challenge, we propose a novel pre-trained model for DocRE, which denoises the document-level DS data via multiple pre-training tasks. Experimental results on the large-scale DocRE benchmark show that our model can capture useful information from noisy DS data and achieve promising results.