A Filter-Based Multi-Join Algorithm in Cloud Computing Environment

Wang Jing, Wang Tengjiao, Yang Dongqing, Li Hongyan
2011-01-01
Journal of Computer Research and Development
Abstract: Cloud computing is an important technology that is widely used for large-scale data analysis. However, the distributed nature of cloud computing makes multi-joins costly to process. This paper examines strategies for multi-joins in the cloud computing environment and proposes a new, efficient filter-based multi-join algorithm. Our approach joins multiple tables concurrently and avoids unnecessary tuple replication by collecting minimal statistics. We present a performance study on the TPC-H benchmark and compare our approach with Hive and two other multi-join algorithms. The experimental results show that our approach significantly improves the efficiency of multi-joins in the cloud computing environment.
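The abstract does not spell out how the filter is built or applied, so the following is only a generic illustration of filter-based pruning before a multi-way join, not the authors' algorithm. It assumes a plain in-memory key set as the filter (a distributed system would more likely use a compact structure such as a Bloom filter), and the table and column names (`customers`, `orders`, `lineitems`, `custkey`, `orderkey`) are hypothetical, loosely modeled on TPC-H.

```python
def key_filter(rows, key):
    """Collect the distinct join-key values of a table (the 'filter')."""
    return {row[key] for row in rows}


def filter_then_join(customers, orders, lineitems):
    """Three-way join customers-orders-lineitems with pre-join filtering.

    Assumes custkey is unique in customers and orderkey is unique in orders.
    Returns the joined rows and the number of tuples of each table that
    survived filtering (the tuples that would actually need to be shipped).
    """
    cust_keys = key_filter(customers, "custkey")
    line_keys = key_filter(lineitems, "orderkey")

    # Drop orders that cannot match on either side.
    kept_orders = [
        o for o in orders
        if o["custkey"] in cust_keys and o["orderkey"] in line_keys
    ]

    # Propagate the surviving keys back to the other two tables.
    order_keys = key_filter(kept_orders, "orderkey")
    kept_cust_keys = key_filter(kept_orders, "custkey")
    kept_customers = {
        c["custkey"]: c for c in customers if c["custkey"] in kept_cust_keys
    }
    kept_lineitems = [l for l in lineitems if l["orderkey"] in order_keys]

    # Join all three tables in one pass over the filtered line items.
    orders_by_key = {o["orderkey"]: o for o in kept_orders}
    joined = []
    for l in kept_lineitems:
        o = orders_by_key[l["orderkey"]]
        c = kept_customers[o["custkey"]]
        joined.append({**c, **o, **l})

    sizes = (len(kept_customers), len(kept_orders), len(kept_lineitems))
    return joined, sizes
```

In this sketch, tuples with no possible join partner are discarded before the join itself, which is the general idea behind avoiding unnecessary tuple replication across nodes; how the paper collects and uses its statistics may differ.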