Demystifying Object-based Big Data Storage Systems

Anindita Sarkar Mondal, Madhupa Sanyal, Ari Kusumastuti, Hrishav Bakul Barua, Kartick Chandra Mondal
2024-06-02
Abstract: Today's era is the digitized era. Managing the big data it generates is an important concern for data scientists, and the demand for big data storage systems increases day by day. Different organizations are involved in providing storage-related services, and they follow different architectures or storage models for storing big data. In this survey paper, our target is to highlight the storage architectures that are provided by different renowned storage service providers. On an architectural basis, we divide big data storage systems into five parts: Distributed File Systems (DFS), Clustered File Systems (CFS), Cloud Storage, Archive Storage, and Object Storage Systems (OSS). We also present a detailed architectural view of the big data storage systems provided by the different organizations under each of these parts.
Databases; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
This paper is a survey of big data storage systems, with a focus on object-based storage. The authors divide big data storage systems into five types: Distributed File Systems (DFS), Clustered File Systems (CFS), Cloud Storage, Archive Storage, and Object Storage Systems (OSS). For each type, they describe in detail the architectures used by well-known storage service providers and highlight the advantages of object storage in cloud environments, such as cost-effectiveness, scalability, and fault tolerance. The paper notes that although cloud storage, and object storage in particular, is becoming increasingly popular, different providers adopt different architectures, which can make it difficult for consumers to choose an appropriate storage service. Its objective is therefore to lay out the architectures of various commercial cloud storage systems so that consumers can understand them and make better-informed decisions. The paper also introduces some key features of object storage, such as space management, security (ensured through credentials and distribution), and the workings of command interpreters. In summary, the paper aims to clarify and compare different types of large-scale data storage systems, with particular attention to the architectures of cloud storage and object storage systems, in order to help data scientists and other users understand and select suitable storage solutions.