Abstract: OLAP databases are increasingly used and must answer queries over multidimensional data within seconds. The cube operator was introduced to precompute aggregates and thereby improve the response time of aggregation queries. Data collected in data warehouses is frequently sparse, and datacubes, which are costly to compute and especially voluminous compared to the input size, can contain many aggregated results that are not significant for decision makers. To avoid this drawback, the concept of the iceberg datacube (answering iceberg queries) was recently introduced with the algorithm BUC. Iceberg datacubes retain only the aggregates satisfying a selection condition (i.e. an SQL HAVING clause). In this paper, we propose an approach for computing a condensed representation of either full or iceberg datacubes. We introduce a novel and sound characterization of datacubes based on dimensional-measurable partitions. Such partitions have an attractive advantage: they avoid sorting techniques, which are replaced by a linear product of dimensional-measurable partitions. Moreover, our datacube characterization provides a logical condensed representation that is of interest with respect to the storage explosion problem. We show that our approach yields an operational solution more efficient than previous proposals: the algorithm APIC. It enforces a lectic-wise traversal of the dimensional lattice and takes into account the critical problem of memory limitation. Our analytical and experimental performance study shows that APIC and BUC are promising candidates for scalable computation and that APIC is the more efficient of the two.
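
To make the notion of an iceberg datacube concrete, the following minimal Python sketch computes one naively: it groups the input by every subset of the dimensions and keeps only the groups that pass a HAVING-style threshold on COUNT(*). It is neither BUC nor APIC, and it groups by hashing rather than by products of dimensional-measurable partitions. The function name `iceberg_cube`, its parameters, and the sample rows are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def iceberg_cube(rows, dims, measure, min_count):
    """Naive iceberg cube (illustrative only, not BUC or APIC).

    For every subset of `dims`, group the rows and keep the groups whose
    COUNT(*) is at least `min_count`, storing (count, SUM(measure)).
    """
    result = {}
    for k in range(len(dims) + 1):
        for group_by in combinations(dims, k):
            groups = defaultdict(lambda: [0, 0])  # key -> [count, sum]
            for row in rows:
                key = tuple(row[d] for d in group_by)
                groups[key][0] += 1
                groups[key][1] += row[measure]
            # Iceberg condition, analogous to HAVING COUNT(*) >= min_count
            for key, (count, total) in groups.items():
                if count >= min_count:
                    result[(group_by, key)] = (count, total)
    return result


# Hypothetical sample data
rows = [
    {"city": "Paris", "product": "pen", "sales": 10},
    {"city": "Paris", "product": "pen", "sales": 5},
    {"city": "Paris", "product": "ink", "sales": 7},
    {"city": "Lyon", "product": "pen", "sales": 3},
]
cube = iceberg_cube(rows, ("city", "product"), "sales", min_count=2)
```

With `min_count=2`, groups supported by a single row (for example the Lyon rows) are dropped from every cuboid. This pruning is what keeps an iceberg datacube smaller than the full datacube on sparse data.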

A Parallel Hierarchical Aggregation Algorithm in High Dimensional Data Warehouse

DROLAP - A Dense-Region Based Approach to On-Line Analytical Processing

Distributed Affinity Propagation Clustering Based on MapReduce

Towards the Building of a Dense-Region-based OLAP System

Data Warehouse Native Feature Based OLAP Querying with Keywords

Requirement-Based Data Cube Schema Design

IMPLEMENT DATA SHARING OVER NETWORK HETEROGENEOUS DATABASES BY DATA DIMENSION REDUCTION METHOD

SCANCHUNK: AN EFFICIENT ALGORITHM FOR HUNTING DENSE REGIONS IN DATA CUBE

An Improved Multi-Dimensional Storage Structure for Data Warehousing

A Clustered Dwarf Structure to Speed Up Queries on Data Cubes

HEDA: Multi-Attribute Unbounded Aggregation over Homomorphically Encrypted Database

Hierarchically Distributed Data Warehouse

Strategies for Complex Data Cube Queries

Privacy-Enhanced And Multifunctional Health Data Aggregation Under Differential Privacy Guarantees

APIC: An Efficient Algorithm for Computing Iceberg Datacubes

Generating Multidimensional Schemata from Relational Aggregation Queries

Paralinear Distance and Its Algorithm for Hierarchical Clustering of High-dimensional Discrete Variables

A Case Study on Parallel HDF5 Dataset Concatenation for High Energy Physics Data Analysis

Clustering Time Series Utilizing A Dimension Hierarchical Decomposition Approach

A Hybrid Parallel Processing Strategy for Large-Scale DEA Computation

An Asynchronous Iteration Approach for Processing on Web Data Warehouse