Capturing Hadoop Storage Big Data Layer Meta-Concepts

Allae Erraissi, Abdessamad Belangour
DOI: https://doi.org/10.1007/978-3-030-11928-7_37
2019-01-01
Abstract: Producing streams of data is of little use if they cannot be stored. Applications, software, and connected objects generate huge volumes of data that need to be collected, stored, and made available for analysis. These data are also valuable and need to be preserved. This is why Big Data has attracted interest from information technology leaders worldwide, and why new ways of storing information have emerged and flourished. In our analysis of this subject, we note that the storage layer is essential to the proper functioning of any Big Data architecture. There are two types of storage at this layer: the Hadoop Distributed File System (HDFS) and NoSQL databases. We rely on our previous work, in which we identified key storage concepts through comparative studies of the main Big Data distributions. The storage layer is located directly above the Data Sources and Data Ingestion layers, for which we have already proposed a meta-model. In this paper, we apply Model Driven Engineering (MDE) techniques to propose a universal meta-model for the storage layer of a Big Data system.