Dynamic Table: A Layered and Configurable Storage Structure in the Cloud.

Xu Cheng, Biping Meng, Yuxin Chen, Peng Zhao, Hongyan Li, Tengjiao Wang, Dongqing Yang
DOI: https://doi.org/10.1007/978-3-642-33050-6_21
2012-01-01
Abstract: Big data brings not only constantly growing data volumes, dynamic and elastic storage demands, and diversified data structures, but also different data features. Apart from traditional dense data, more and more "sparse" data has emerged and now accounts for the majority of massive data. Adapting to the characteristics of sparse data without losing sight of the traits of dense data is a challenge. To meet these differentiated storage demands and to give a proper way to express the semantics of absent values, we propose a three-layered storage structure named "Dynamic Table" to represent incomplete data. Our approach considers the distributed storage requirements of the cloud and aims to support a hybrid row and column layout, which allows users to mix and match the two kinds of physical storage formats on demand. In addition, the original semantics of absent values is divided into two parts with distinct treatments; specifically, a four-valued logic is introduced. Experiments on synthetic and real-world data sets demonstrate that our approach combines the advantages of columnar storage with the merits of row-oriented storage. Distinguishing the semantics of absent values is necessary to describe the missing values in sparse data sets. © 2012 Springer-Verlag.