Multi-Way Theta-Join Based on CMD Storage Method

Lei Li, Hong Gao, Mingrui Zhu, Zhaonian Zou
DOI: https://doi.org/10.1007/978-3-319-05810-8_5
2014-01-01
Abstract: In the era of Big Data, analyzing vast quantities of data is a challenging problem, and the multi-way theta-join query is one of the most time-consuming operations. MapReduce is the most frequently discussed framework for massive data processing, and several join algorithms based on it have been proposed in recent years. However, the MapReduce paradigm itself may not suit some scenarios, and multi-way theta-join appears to be one of them. Many multi-way theta-join algorithms for traditional parallel databases have been proposed over the years, but none targets the CMD (coordinate modulo distribution) storage method, although some equi-join algorithms for it have been proposed. In this paper, we propose a multi-way theta-join method based on CMD that takes advantage of the CMD storage method. Experiments suggest that it is a valid and efficient method that achieves significant improvement over methods applied on MapReduce.