APIC: An Efficient Algorithm for Computing Iceberg Datacubes

L. Lakhal, Noël Novelli, R. Cicchetti
Abstract: OLAP databases are increasingly used and must handle multidimensional data within seconds. The cube operator was introduced to precompute aggregates in order to improve the response time of aggregation queries. Data collected in data warehouses is frequently sparse, and datacubes, which are costly to compute and especially voluminous compared to the input size, can contain many aggregated results that are not significant for decision makers. To avoid this drawback, the concept of the iceberg datacube (answering iceberg queries) was recently introduced with the algorithm BUC. Iceberg datacubes gather only the aggregates that satisfy a selection condition (i.e., an SQL HAVING clause). In this paper, we propose an approach for computing a condensed representation of either full or iceberg datacubes. We introduce a novel and sound characterization of datacubes based on dimensional-measurable partitions. Such partitions have an attractive advantage: they avoid sorting techniques, which are replaced by a linear product of dimensional-measurable partitions. Moreover, our datacube characterization provides a logical condensed representation, which is of interest when considering the storage explosion problem. We show that our approach yields an operational solution more efficient than previous proposals: the algorithm APIC. It enforces a lectic-wise traversal of the dimensional lattice and takes into account the critical problem of memory limitation. Our analytical and experimental performance study shows that APIC and BUC are both promising candidates for scalable computation, and that APIC is the more efficient of the two.
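To make the idea of an iceberg datacube concrete, the following minimal Python sketch computes every cuboid whose COUNT(*) meets a threshold, refining partitions of row identifiers in one linear pass instead of sorting. It is an illustration only, not the APIC algorithm: the function names `refine` and `iceberg_cube`, the dictionary-based partitions, the depth-first traversal, and the restriction to the COUNT measure are all assumptions made for this example.

```python
def refine(classes, rows, dim, min_count):
    """Split each class of a partition by the value of `dim`.

    `classes` maps a tuple of dimension values to the list of row ids
    sharing those values. One linear pass over the row ids replaces
    sorting. Classes smaller than `min_count` are dropped, which is safe
    for COUNT because refining a class can only shrink it.
    """
    finer = {}
    for key, ids in classes.items():
        for i in ids:
            finer.setdefault(key + (rows[i][dim],), []).append(i)
    return {k: ids for k, ids in finer.items() if len(ids) >= min_count}


def iceberg_cube(rows, dims, min_count):
    """Return {dimension subset: {value tuple: count}} for all groups
    satisfying HAVING COUNT(*) >= min_count (hypothetical helper)."""
    cube = {}

    def expand(prefix, classes, start):
        cube[prefix] = {k: len(ids) for k, ids in classes.items()}
        for j in range(start, len(dims)):
            finer = refine(classes, rows, dims[j], min_count)
            if finer:  # prune: no frequent class left to specialize
                expand(prefix + (dims[j],), finer, j + 1)

    if len(rows) >= min_count:
        expand((), {(): list(range(len(rows)))}, 0)
    return cube


if __name__ == "__main__":
    sales = [
        {"city": "Paris", "product": "pen"},
        {"city": "Paris", "product": "pen"},
        {"city": "Paris", "product": "ink"},
        {"city": "Lyon", "product": "pen"},
    ]
    for cuboid, groups in iceberg_cube(sales, ["city", "product"], 2).items():
        print(cuboid, groups)
```

Because pruned classes are never refined further, sparse inputs tend to cut off most of the lattice early, which is the kind of saving the iceberg condition is meant to provide.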
Computer Science