Switch-SSD Cache Based XML Query Processing in Hadoop

Changlong Zhou, Minghua Jiang, Feng Yu
DOI: https://doi.org/10.1109/wartia.2014.6976477
2014-01-01
Abstract: Hadoop, open source software that implements the MapReduce framework, is an ideal solution for speeding up parallel XML query processing. We propose a distributed caching architecture for Hadoop clusters, called switch-SSD, which caches XML query results en route in the network switching nodes. Switch-SSD extends the limited memory space of OpenFlow switches with an SSD so that XML query results can be cached in the switch. We design an OpenFlow controller that acts as a cache manager coordinating the switch-SSDs. With the help of the controller, the switch-SSD intercepts a query request and proactively sends the cached results to the client, rather than having the client perform a cache read operation. By caching the results, switch-SSD reduces query computation and lowers job execution times in the Hadoop cluster. Experimental results show that switch-SSD can improve the efficiency of most existing parallel XML query processing approaches in a Hadoop cluster.
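The request path described in the abstract can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the class names `SwitchSSDCache` and `CacheManager`, the `run_job` callback standing in for a Hadoop XML query job, and the LRU eviction policy are all hypothetical, and no real OpenFlow messages are exchanged.

```python
from collections import OrderedDict
from typing import Callable, Dict, Optional


class SwitchSSDCache:
    """Toy stand-in for one switch's SSD-backed result cache.

    LRU eviction is an assumption; the abstract does not name a policy.
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._store: "OrderedDict[str, str]" = OrderedDict()

    def get(self, query: str) -> Optional[str]:
        if query not in self._store:
            return None
        self._store.move_to_end(query)  # mark as most recently used
        return self._store[query]

    def put(self, query: str, result: str) -> None:
        self._store[query] = result
        self._store.move_to_end(query)
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict least recently used


class CacheManager:
    """Hypothetical controller role: decide whether a switch can answer."""

    def __init__(self, switches: Dict[str, SwitchSSDCache],
                 run_job: Callable[[str], str]) -> None:
        self.switches = switches
        self.run_job = run_job  # placeholder for submitting a Hadoop job
        self.jobs_run = 0

    def handle_query(self, query: str, ingress: str) -> str:
        cache = self.switches[ingress]
        cached = cache.get(query)
        if cached is not None:
            # Hit: the switch answers the client directly, no job is run.
            return cached
        # Miss: run the query in the cluster, then cache the result en route.
        self.jobs_run += 1
        result = self.run_job(query)
        cache.put(query, result)
        return result


if __name__ == "__main__":
    manager = CacheManager({"s1": SwitchSSDCache(capacity=2)},
                           run_job=lambda q: "result:" + q)
    manager.handle_query("//book/title", "s1")  # miss, runs the job
    manager.handle_query("//book/title", "s1")  # hit, served by the switch
    print(manager.jobs_run)
```

In this sketch a repeated query is answered from the switch cache without a second job, which is the effect the abstract attributes to switch-SSD.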