Wikipedia-Based Entity Semantifying in Open Information Extraction

Qiuhao Lu, Youtian Du
DOI: https://doi.org/10.1109/icdar.2017.130
2017-01-01
Abstract: In recent years, Open Information Extraction (OIE), an unsupervised strategy that extracts open-domain facts from massive heterogeneous text corpora, has improved markedly. However, the facts extracted by OIE systems (generally represented as triples) lack clear semantics and are therefore difficult for computer systems to understand. In this paper, we present a new method that semantifies these facts by mapping the string arguments in the triples to the corresponding real-world entities in an existing knowledge base, Wikipedia. First, for each string-argument query, we consider a set of its most likely candidate entities and assign each candidate a fused prior probability. Then we calculate a graph-based similarity between candidates, which serves as contextual evidence, by propagating semantics on the neighborhood graph of the candidates. Finally, we cast the mapping task as an optimization problem and find the maximum a posteriori (MAP) mapping by combining the prior information and the contextual evidence through Bayes' theorem. By fusing multiple cues and propagating semantics over the graph, our approach improves the performance of entity semantifying. Experimental results demonstrate the effectiveness of our approach.
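The three steps in the abstract (candidate priors, pairwise contextual similarity, MAP selection) can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `map_assignment`, the toy priors and similarity values, the product-form score, and the brute-force search are all assumptions made for illustration, and the paper's fused prior and graph-propagated similarity are replaced here by hand-written numbers.

```python
from itertools import product
from math import log


def map_assignment(priors, similarity, floor=1e-6):
    """Pick one candidate entity per string argument by maximizing a toy posterior.

    priors:     {argument: {candidate: prior probability}}  (hypothetical values)
    similarity: {frozenset({cand_a, cand_b}): score in (0, 1]}  (hypothetical values)
    floor:      similarity assumed for candidate pairs missing from the table
    """
    arguments = list(priors)
    best, best_score = None, float("-inf")
    # Exhaustive search over joint assignments; only feasible for tiny inputs.
    for combo in product(*(priors[a] for a in arguments)):
        # Log-prior term: sum over the chosen candidates.
        score = sum(log(priors[a][c]) for a, c in zip(arguments, combo))
        # Contextual term: log-similarity over every pair of chosen candidates.
        for i in range(len(combo)):
            for j in range(i + 1, len(combo)):
                pair = frozenset((combo[i], combo[j]))
                score += log(similarity.get(pair, floor))
        if score > best_score:
            best, best_score = dict(zip(arguments, combo)), score
    return best


# Toy triple arguments "Jordan" and "Bulls" with made-up numbers.
priors = {
    "Jordan": {"Jordan (country)": 0.6, "Michael Jordan": 0.4},
    "Bulls": {"Chicago Bulls": 0.7, "Bull (animal)": 0.3},
}
similarity = {
    frozenset(("Michael Jordan", "Chicago Bulls")): 0.9,
    frozenset(("Jordan (country)", "Chicago Bulls")): 0.05,
    frozenset(("Jordan (country)", "Bull (animal)")): 0.1,
    frozenset(("Michael Jordan", "Bull (animal)")): 0.05,
}

result = map_assignment(priors, similarity)
print(result)
```

With these toy numbers, the contextual term outweighs the higher prior of "Jordan (country)", so the sketch selects "Michael Jordan" together with "Chicago Bulls". The example only shows how prior and contextual evidence can be combined in one score; it says nothing about how the paper computes either quantity.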