A Hybrid Framework for Semantic Relation Extraction over Enterprise Data

Wei Shen, Jianyong Wang, Ping Luo, Min Wang
DOI: https://doi.org/10.4018/ijswis.2015070101
2015-01-01
Abstract: Relation extraction from Web data has attracted considerable attention in recent years. However, little work has been done on relation extraction from enterprise data, despite the urgent need for such work in real applications (e.g., E-discovery). One distinct characteristic of enterprise data (in comparison with Web data) is its low redundancy. Previous work on relation extraction from Web data largely relies on the data's high level of redundancy and thus cannot be applied effectively to enterprise data. This paper proposes an unsupervised hybrid framework called REACTOR. REACTOR combines a statistical method, classification, and clustering to automatically identify various types of relations among entities appearing in enterprise data. Furthermore, the authors explore applying pronominal anaphora resolution to extract more relations expressed across multiple sentences. They evaluate REACTOR on a real-world enterprise data set from HP that contains over three million pages, and the experimental results show the effectiveness of REACTOR.