Relation Extraction Method Combining Clause Level Distant Supervision and Semi-supervised Ensemble Learning

Xiaokang YU,Ling CHEN,Jing GUO,Yaya CAI,Yong WU,Jingchang WANG
DOI: https://doi.org/10.16451/j.cnki.issn1003-6059.201701006
2017-01-01
Abstract: To address noisy training data and the insufficient use of negative instances in traditional distant supervision relation extraction methods, a relation extraction method combining clause-level distant supervision and semi-supervised ensemble learning is proposed. First, a relation instance set is generated by distant supervision. Second, based on clause identification, a denoising algorithm reduces the wrongly labeled data in the relation instance set. Third, lexical features are extracted from the relation instances and transformed into distributed vectors to build a feature dataset. Finally, all positive data and part of the negative data in the feature dataset form the labeled dataset, and the remaining negative data form the unlabeled dataset. A relation classifier is then trained with an improved semi-supervised ensemble learning algorithm. Experiments show that the proposed method achieves higher accuracy and recall than the baseline methods.
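The labeling and dataset-splitting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy knowledge base `KB`, the comma-based `split_clauses` helper, the rule that both entities must share a clause, and the `negative_keep_ratio` parameter are all assumptions made for illustration; the abstract does not specify the clause identification or denoising algorithm, and feature extraction and the ensemble classifier are omitted.

```python
import random
import re

# Hypothetical toy knowledge base: (head, tail) -> relation name.
KB = {("Paris", "France"): "capital_of"}


def split_clauses(sentence):
    """Naive splitter on commas/semicolons; a stand-in for real clause identification."""
    return [c.strip() for c in re.split(r"[,;]", sentence) if c.strip()]


def distant_label(sentence, head, tail):
    """Sentence-level distant supervision: if both entities occur, look up the KB.

    Returns the KB relation, "NA" (negative instance), or None (no instance).
    """
    if head in sentence and tail in sentence:
        return KB.get((head, tail), "NA")
    return None


def clause_level_label(sentence, head, tail):
    """Assumed denoising rule: keep a positive label only if both entities share a clause."""
    label = distant_label(sentence, head, tail)
    if label is None or label == "NA":
        return label
    for clause in split_clauses(sentence):
        if head in clause and tail in clause:
            return label
    return None  # treated as a likely wrongly labeled instance and dropped


def split_labeled_unlabeled(instances, negative_keep_ratio=0.5, seed=0):
    """Keep all positives plus a share of negatives as labeled; the rest become unlabeled.

    `instances` is a list of (features, label) pairs; "NA" marks a negative.
    """
    rng = random.Random(seed)
    positives = [x for x in instances if x[1] != "NA"]
    negatives = [x for x in instances if x[1] == "NA"]
    rng.shuffle(negatives)
    k = int(len(negatives) * negative_keep_ratio)
    labeled = positives + negatives[:k]
    unlabeled = [features for features, _ in negatives[k:]]  # labels discarded
    return labeled, unlabeled
```

With this rule, a sentence such as "Paris hosted the summit, while France voted later" is dropped because the two entities fall in different clauses, whereas "Paris is the capital of France, and it is large" keeps its `capital_of` label. The unlabeled negatives would then be passed, together with the labeled set, to a semi-supervised learner.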