A Parallel Hierarchical Aggregation Algorithm in High Dimensional Data Warehouse

Kongfa Hu, Jiajia Liu, Ling Chen, Qingli Da
DOI: https://doi.org/10.1109/FSKD.2007.106
2007-01-01
Abstract: OLAP (on-line analytical processing) queries tend to be complex and ad hoc, often requiring computationally expensive operations such as multi-table joins and aggregation. In a high dimensional data warehouse (DW), fully materializing the data cube is infeasible. In this paper, we propose a novel aggregation algorithm, PDHEPA (parallel pre-grouping aggregation based on the dimension hierarchical encoding), which vertically partitions a high dimensional dataset into a set of disjoint low dimensional datasets called fragment mini-cubes. PDHEPA uses small dimension hierarchical encodings and their prefixes, which drastically reduces the number of multi-table join operations. As a result, the proposed method greatly reduces disk I/O and improves the efficiency of OLAP queries. Analytical and experimental results show that PDHEPA is more efficient than other existing methods.