GStar: an Efficient Framework for Answering Top-K Star Queries on Billion-Node Knowledge Graphs

Jiahui Jin, Junzhou Luo, Samamon Khemmarat, Fang Dong, Lixin Gao
DOI: https://doi.org/10.1007/s11280-018-0611-0
2018-01-01
World Wide Web
Abstract: Massive knowledge graphs, such as Linked Open Data or Freebase, contain billions of labeled entities and relationships. Star queries aim to identify an entity given a set of related entities, and they are common with massive knowledge graphs. It is important to find the best way to answer star queries, and we can do this by treating it as a graph pattern-matching problem. Because knowledge graphs are noisy and incomplete in nature, we must find answers that match the star pattern closely, and extract a precise match if possible. Thus, here we propose GStar, a framework to identify the top-k best answers for a star query. GStar effectively and efficiently answers top-k star queries on billion-node graphs through a novel query model, an index-free query algorithm, and a distributed query system. We evaluate GStar through experiments on real-world knowledge graphs. Experimental results show that our query model effectively answers real-life star-pattern queries; our query algorithm can answer top-k queries in a near-real-time manner without requiring expensive graph indices; and the distributed system scales well with both the graph size and number of machines used for computation.
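
To make the notion of a top-k star query concrete, the following is a minimal sketch on a toy graph. It is not GStar's query model or algorithm, which the abstract does not detail. The triples, the function name `top_k_star`, and the scoring rule (count of query edges a candidate matches, so partial matches still rank) are all assumptions chosen purely for illustration.

```python
import heapq
from collections import defaultdict

# Hypothetical toy knowledge graph as (subject, relation, object) triples.
TRIPLES = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Inception", "stars", "Leonardo DiCaprio"),
    ("Inception", "genre", "Sci-Fi"),
    ("Interstellar", "directed_by", "Christopher Nolan"),
    ("Interstellar", "genre", "Sci-Fi"),
    ("Titanic", "stars", "Leonardo DiCaprio"),
]


def top_k_star(triples, query, k):
    """Rank candidate centre entities by how many query edges they match.

    `query` is a list of (relation, related_entity) pairs describing the
    spokes of the star; the centre entity is the unknown being searched for.
    The match-count score is an illustrative assumption, not GStar's model.
    """
    wanted = set(query)
    scores = defaultdict(int)
    # De-duplicate triples so a repeated edge is not counted twice.
    for subject, relation, obj in set(triples):
        if (relation, obj) in wanted:
            scores[subject] += 1
    # Highest score first; ties broken by entity name for determinism.
    return heapq.nsmallest(k, scores.items(), key=lambda kv: (-kv[1], kv[0]))


query = [
    ("directed_by", "Christopher Nolan"),
    ("stars", "Leonardo DiCaprio"),
    ("genre", "Sci-Fi"),
]
result = top_k_star(TRIPLES, query, k=2)
print(result)
```

In this sketch an entity that matches only some of the spokes is still returned with a lower score, which mirrors the abstract's point that answers should match the star pattern closely when the graph is noisy or incomplete. The full scan over all triples is only workable for a toy example; it says nothing about how an index-free algorithm would operate at billion-node scale.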