An Array-based Algorithm for Data Cube Computation with PC Cluster

李盛恩,李翠平,王珊,杜小勇
DOI: https://doi.org/10.3969/j.issn.1000-7180.2003.08.001
2003-01-01
Abstract: Computing a data cube is very expensive because a large amount of data has to be accessed. We investigate using a low-cost PC cluster to compute the data cube. In our approach, a multidimensional array is used to store the data. We partition the multidimensional array into fragments and distribute them among the machines in the cluster. Fragments are compressed to save storage space, access time, and incremental maintenance time. We propose a novel method for organizing the pipeline and an idea of creating a fragment index file. The cost of external sorting and the amount of disk access are dramatically reduced. The experimental results show that the algorithm is scalable.
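The partition-and-distribute step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's fragment layout, compression scheme, or placement policy, so everything here is an assumption made for illustration: the function names `partition` and `assign` are hypothetical, non-empty cells are stored as sorted `(offset, value)` pairs as a stand-in for compression, and fragments are placed round-robin.

```python
def partition(cells, chunk):
    """Group sparse cells {coords: value} into fragments.

    Assumed layout (not taken from the paper): a fragment is identified by
    its chunk coordinates and stores only non-empty cells as sorted
    (row-major offset within the fragment, value) pairs.
    """
    fragments = {}
    for coords, value in cells.items():
        frag_id = tuple(c // k for c, k in zip(coords, chunk))
        offset = 0
        for c, k in zip(coords, chunk):
            offset = offset * k + (c % k)
        fragments.setdefault(frag_id, []).append((offset, value))
    for pairs in fragments.values():
        pairs.sort()
    return fragments


def assign(fragments, n_machines):
    """Place fragments on machines round-robin (an assumed policy)."""
    return {frag_id: i % n_machines
            for i, frag_id in enumerate(sorted(fragments))}


# A 4x4 array with four non-empty cells, split into 2x2 fragments.
cells = {(0, 0): 5, (0, 3): 2, (2, 1): 7, (3, 3): 1}
fragments = partition(cells, chunk=(2, 2))
placement = assign(fragments, n_machines=2)

# Each machine aggregates its own fragments; partial sums are then combined.
partial = {}
for frag_id, pairs in fragments.items():
    machine = placement[frag_id]
    partial[machine] = partial.get(machine, 0) + sum(v for _, v in pairs)
total = sum(partial.values())
```

Storing only occupied offsets keeps sparse fragments small, and because each fragment is self-contained, a machine can aggregate its share without reading the others.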