OLAP Aggregation Based on Dimension-oriented Storage

Zhao Jing-hua, Song Ai-mei, Song Ai-bo
DOI: https://doi.org/10.1109/ipdpsw.2012.241
2012-01-01
Abstract: OLAP (online analytical processing) applications are based on a variety of aggregate queries over large-scale data. Because aggregation is always performed on columns, traditional row-oriented storage, in which all the columns of a data row are stored together, seriously restricts aggregation performance. This paper proposes a dimension-oriented storage model based on HBase, together with a new parallel aggregation technique that carries out aggregation operations as parallel MapReduce jobs. Finally, in a comparison with Hive on the standard TPC-H data set, our technique is shown to significantly improve the performance of core aggregate operations.
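To make the column-versus-row argument concrete, the following is a minimal in-memory sketch, not the paper's HBase schema or its actual MapReduce jobs. The toy table `ROWS`, the column names, and every function name are hypothetical, chosen only for illustration. It pivots rows into per-column lists so that an aggregate touches only the two columns it needs, then computes a grouped sum in a map/reduce style: each chunk produces partial sums, and the partial results are merged.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical toy fact table in row-oriented form: each row carries all columns.
ROWS = [
    {"region": "EU", "year": 2011, "revenue": 10.0},
    {"region": "US", "year": 2011, "revenue": 20.0},
    {"region": "EU", "year": 2012, "revenue": 5.0},
    {"region": "US", "year": 2012, "revenue": 7.5},
]


def to_columns(rows):
    """Pivot rows into a column layout: one list of values per column name."""
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)


def split(keys, values, n_parts):
    """Cut two aligned columns into at most n_parts chunks of (key, value) pairs."""
    pairs = list(zip(keys, values))
    size = max(1, -(-len(pairs) // n_parts))  # ceiling division, at least 1
    return [pairs[i:i + size] for i in range(0, len(pairs), size)]


def map_partial_sums(chunk):
    """Map step: sum the measure per group key within a single chunk."""
    partial = defaultdict(float)
    for key, value in chunk:
        partial[key] += value
    return dict(partial)


def reduce_merge(left, right):
    """Reduce step: merge two dictionaries of partial sums."""
    merged = dict(left)
    for key, value in right.items():
        merged[key] = merged.get(key, 0.0) + value
    return merged


def sum_by(columns, group_col, measure_col, n_parts=2):
    """Grouped sum that reads only the grouping column and the measure column."""
    chunks = split(columns[group_col], columns[measure_col], n_parts)
    partials = [map_partial_sums(chunk) for chunk in chunks]  # independent per chunk
    return reduce(reduce_merge, partials, {})


if __name__ == "__main__":
    print(sum_by(to_columns(ROWS), "region", "revenue"))
```

In this sketch the map step runs sequentially; because each chunk is processed independently, it is the part that a MapReduce framework could distribute across workers, while the `year` column is never read for a revenue-by-region query.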