Efficient m-closest entity matching over heterogeneous information networks

Wancheng Long, Xiaowen Li, Liping Wang, Fan Zhang, Zhe Lin, Xuemin Lin
DOI: https://doi.org/10.1016/j.knosys.2023.110299
IF: 8.139
2023-03-01
Knowledge-Based Systems
Abstract: This work investigates a novel m-closest entity (mCE) matching problem over geographic heterogeneous information networks (Geo-HINs). Given a Geo-HIN G and m query graphs {q_1, q_2, …, q_m}, mCE matching aims to find a group of geographic entities (geo-entities) whose patterns match the query graphs {q_1, q_2, …, q_m} correspondingly, such that the maximum distance between any geo-entity pair in the group (i.e., the diameter) is minimized. As a fundamental problem, mCE matching can be applied in many scenarios, e.g., travel itinerary recommendation and city planning. Existing work does not consider pattern matching and spatial search simultaneously and therefore cannot solve this problem, which is computationally expensive. To solve it efficiently, we propose a unified framework, named the Fuzzy-Exact framework, that processes entity matching and spatial search comprehensively and cooperatively exploits pruning at both the non-spatial and spatial layers. Two mutually adaptive auxiliary data structures, named Arc-Tree and Arc-Forest, are devised to maintain intermediate search results, which are used to enhance the search process between the non-spatial and spatial layers. Experimental results demonstrate that our algorithm can outperform the baseline methods by two orders of magnitude in runtime.
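To make the objective in the problem statement concrete, the following is a minimal brute-force sketch of the diameter-minimization step only. It is not the Fuzzy-Exact framework and uses neither Arc-Tree nor Arc-Forest. It assumes that pattern matching has already produced, for each query graph q_i, a list of candidate geo-entity locations, and that distance is Euclidean over 2-D coordinates; the names `diameter`, `brute_force_mce`, and `candidates` are hypothetical and do not come from the paper.

```python
from itertools import product
from math import dist, inf


def diameter(group):
    """Largest pairwise Euclidean distance within a group of points."""
    return max(
        (dist(a, b) for i, a in enumerate(group) for b in group[i + 1:]),
        default=0.0,  # a single-point group has diameter 0
    )


def brute_force_mce(candidates):
    """Pick one location per query graph so that the group diameter is minimal.

    candidates[i] is the list of locations of geo-entities that are ASSUMED
    to have already been matched against query graph q_i (the matching step
    itself is outside this sketch). Returns (best_group, best_diameter), or
    (None, inf) if some query graph has no candidate.
    """
    best_group, best_diam = None, inf
    # Enumerate every combination that takes exactly one candidate per query.
    for group in product(*candidates):
        d = diameter(group)
        if d < best_diam:
            best_group, best_diam = group, d
    return best_group, best_diam


if __name__ == "__main__":
    # Illustrative, made-up coordinates for m = 3 query graphs.
    candidates = [
        [(0.0, 0.0), (10.0, 10.0)],  # matches assumed for q_1
        [(1.0, 0.0), (9.0, 9.0)],    # matches assumed for q_2
        [(0.0, 1.0), (20.0, 20.0)],  # matches assumed for q_3
    ]
    print(brute_force_mce(candidates))
```

The enumeration visits the full Cartesian product of the candidate lists, so its cost grows multiplicatively with the number of candidates per query graph. This illustrates why the abstract describes the problem as computationally expensive and why pruning at both layers matters.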
computer science, artificial intelligence