Enriching Relations with Additional Attributes for ER

Mengyi Yan, Wenfei Fan, Yaoshu Wang, Min Xie
DOI: https://doi.org/10.14778/3681954.3681987
IF: 2.5
2024-07-01
Proceedings of the VLDB Endowment
Abstract: This paper studies a new problem of relation enrichment. Given a relation D of schema R and a knowledge graph G with overlapping information, the goal is to identify a small number of relevant features from G and extend schema R with the additional attributes, so as to maximally improve the accuracy of resolving entities represented by the tuples of D. We formulate the enrichment problem and show its intractability. Nonetheless, we propose a method to extract features from G that are diverse from the existing attributes of R, minimize null values, and moreover, reduce false positives and false negatives of entity resolution (ER) models. The method links tuples and vertices that refer to the same entity, learns a robust policy to extract attributes via reinforcement learning, and jointly trains the policy and ER models. Moreover, we develop algorithms for (incrementally) enriching D. Using real-life data, we experimentally verify that relation enrichment improves the accuracy of ER by more than 15.4 percentage points when adding 5 attributes, and by up to 33%.
computer science, information systems, theory & methods