Implementation and Optimization of RDF Query Using Hadoop.

Yanwen Chen, Fabrice Huet, Yixiang Chen
DOI: https://doi.org/10.5220/0003387805120515
2011-01-01
Abstract: With the prevalence of the Semantic Web, a great deal of RDF data has been created, reaching tens of petabytes, which has drawn growing attention to high-performance data processing. In recent years, Hadoop, built on the MapReduce framework, has provided a good way to process massive data in parallel. In this paper, we focus on using Hadoop to query RDF data from large data repositories. First, we propose a prototype to process a SPARQL query. Then, we present several ways to optimize our solution. Results show that better performance has been achieved, almost 70% improvement due to the