A Distributed RDF Storage and Query Model Based on HBase

Keran Li, Bin Wu, Bai Wang
DOI: https://doi.org/10.1007/978-3-319-23531-8_1
2015-01-01
Abstract: We live in an interconnected world in which the amount of heterogeneous data, such as RDF, is continually increasing. Much work has been done on managing huge amounts of RDF data. Solutions based on RDBMSs have significant scalability issues given the magnitude of modern data. In this paper we describe our solution for storing and querying RDF data in the cloud, based on HBase and MapReduce. A vertical-partitioning-like model is used in HBase to reduce table size and to achieve good SPARQL query performance. For complex queries over large data, we propose using cascading MapReduce jobs on HBase to improve efficiency. Our experiments on LUBM show that our system can store large RDF graphs and achieve good query efficiency.
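To make the idea of vertical partitioning concrete, the following is a minimal in-memory sketch in Python. It is not the authors' HBase schema or MapReduce code, which the abstract does not specify. The function names `partition_by_predicate` and `join_on_subject`, and the sample triples, are hypothetical and chosen only for illustration. The sketch groups triples into one small table per predicate and then answers a two-pattern query by joining two of those tables on the subject.

```python
from collections import defaultdict


def partition_by_predicate(triples):
    """Group (subject, predicate, object) triples into one table per predicate.

    Each table holds (subject, object) pairs, so a query that names a
    predicate only has to scan that predicate's table.
    """
    tables = defaultdict(list)
    for s, p, o in triples:
        tables[p].append((s, o))
    return dict(tables)


def join_on_subject(left, right):
    """Join two predicate tables on their shared subject (a simple hash join)."""
    index = defaultdict(list)
    for s, o in left:
        index[s].append(o)
    return [(s, lo, ro) for s, ro in right for lo in index.get(s, [])]


# Hypothetical sample data, loosely in the style of a university benchmark.
triples = [
    ("alice", "rdf:type", "Student"),
    ("alice", "takesCourse", "course1"),
    ("bob", "rdf:type", "Professor"),
    ("bob", "teacherOf", "course1"),
]

tables = partition_by_predicate(triples)

# Roughly: SELECT ?s ?t ?c WHERE { ?s rdf:type ?t . ?s takesCourse ?c }
result = join_on_subject(tables["rdf:type"], tables["takesCourse"])
print(result)
```

In a distributed setting, each join of this kind would plausibly correspond to one job in a cascade, with the output of one join feeding the next; that mapping is an assumption made here for illustration.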