Coexistence of Multiple Partition Plan Based Physical Database Design.

Liming Dong, Weidong Liu, Jie Shao
DOI: https://doi.org/10.1145/3057109.3057111
2017-01-01
Abstract: Big Data has brought great challenges to traditional DBMSs. Nearly all Big Data management systems handle Big Data by partitioning it on specific attributes and distributing the partitions across a cluster. However, when data relationships and query requirements are exceedingly complex, it is difficult to decide which attributes to partition on: a table can be partitioned in only one way, while different queries may need different partition plans that conflict with each other. At the same time, replication is a common approach to achieving high fault tolerance. Consequently, using multiple replicas to resolve partition conflicts is a reasonable way to improve system performance.

In this paper we analyze the causes of these conflicts and present a method to identify and resolve them. Our method first classifies queries into categories by their requirements, and then uses a partition algorithm to search for the optimal partition plan for each query category. By introducing a two-tier server architecture, we can make more effective use of these replicas. We evaluate our method on TPC-E and TPC-H; the results demonstrate that it can improve system performance by up to 4x over a single-partition-plan method.
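The idea of giving each query category its own differently partitioned replica can be illustrated with a toy sketch. This is not the authors' algorithm: the workload, the attribute names, and the helpers `classify`, `assign_replicas`, and `route` are all hypothetical, and the "partition plan" is reduced to a single preferred partition key per query, with replicas handed out by simple frequency ranking.

```python
from collections import defaultdict

# Hypothetical workload: (query name, attribute the query would prefer as partition key).
WORKLOAD = [
    ("q1", "customer_id"),
    ("q2", "order_date"),
    ("q3", "customer_id"),
]

def classify(workload):
    """Group query names by their preferred partition key (assumed known per query)."""
    categories = defaultdict(list)
    for name, attr in workload:
        categories[attr].append(name)
    return dict(categories)

def assign_replicas(categories, num_replicas):
    """Give the most-demanded partition keys one replica each (a simplifying assumption)."""
    ranked = sorted(categories, key=lambda a: (-len(categories[a]), a))
    return {attr: idx for idx, attr in enumerate(ranked[:num_replicas])}

def route(attr, replica_of, default=0):
    """Pick the replica partitioned on attr, else fall back to a default replica."""
    return replica_of.get(attr, default)

categories = classify(WORKLOAD)
replica_of = assign_replicas(categories, num_replicas=2)
```

Here `route("order_date", replica_of)` selects the replica partitioned on `order_date`, while a query on an attribute with no dedicated replica falls back to replica 0. A real design would replace the frequency ranking with a cost-based search over partition plans.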