Dynamic Label Correction for Distant Supervision Relation Extraction via Semantic Similarity.

Xinyu Zhu, Gongshen Liu, Bo Su, Jan Pan Nees
DOI: https://doi.org/10.1007/978-3-030-32236-6_2
2019-01-01
Abstract: Relation extraction (RE) suffers from a lack of data. A widely used solution is distant supervision, but it introduces many wrongly labeled sentences. Previous work performed bag-level training to reduce the effect of noisy data. However, these methods are suboptimal because they cannot handle the case where all the sentences in a bag are wrongly labeled. The best way to reduce noise is to recognize the wrong labels and correct them. In this paper, we propose a novel model that dynamically corrects wrong labels, which allows models to be trained at the sentence level and improves the quality of the dataset without reducing its size. A semantic similarity module and a new label correction algorithm are designed. We combine semantic similarity and classification probability to evaluate the original label, and correct it if it is wrong. The proposed method works as an additional module that can be applied to any classification model. Experiments show that the proposed method can accurately correct wrong labels, both false positives and false negatives, and greatly improves the performance of relation classification compared to state-of-the-art systems.
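The abstract does not spell out how semantic similarity and classification probability are combined, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. The function name `correct_label`, the linear mixing weight `alpha`, and the `margin` threshold are all hypothetical choices made for illustration.

```python
from typing import Sequence


def correct_label(
    class_probs: Sequence[float],
    similarities: Sequence[float],
    original_label: int,
    alpha: float = 0.5,   # hypothetical mixing weight, not from the paper
    margin: float = 0.1,  # hypothetical confidence gap required to relabel
) -> int:
    """Return a (possibly corrected) label index for one sentence.

    class_probs[i]  : classifier probability of relation i
    similarities[i] : semantic similarity between the sentence and relation i
    """
    if len(class_probs) != len(similarities):
        raise ValueError("class_probs and similarities must have equal length")

    # Assumed combination rule: a weighted average of the two signals.
    scores = [
        alpha * p + (1.0 - alpha) * s
        for p, s in zip(class_probs, similarities)
    ]

    # Relation with the highest combined score.
    best = max(range(len(scores)), key=scores.__getitem__)

    # Relabel only when the best candidate clearly beats the original label.
    if scores[best] - scores[original_label] > margin:
        return best
    return original_label
```

For example, a sentence originally labeled with index 0 (say, "no relation") whose classifier probability and similarity both favor index 1 would be relabeled to 1, which corresponds to correcting a false negative. When the combined scores for the original label and the best candidate are close, the original label is kept.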