Abstract: With the growing popularity of Internet-based services and of services hosted on cloud platforms, a more powerful back-end storage system is needed to support them. At present, it is very difficult, if not impossible, to implement a distributed storage system that meets every requirement at once. Research therefore focuses on constraining particular characteristics so as to design distributed storage solutions tailored to different usage scenarios. Storage for economic big data must offer high storage efficiency and fast retrieval. The large number of small files and the diversity of file types make the storage and retrieval of economic big data severely challenging. This paper addresses the application requirements of cross-modal analysis of economic big data. Based on the sources and characteristics of economic big data, it analyzes the data types and designs the database storage architecture and data storage structure. Taking into account the spatial, temporal, and semantic characteristics of economic big data, it proposes a unified coding method based on a multilevel spatiotemporal data division strategy that combines Geohash and Hilbert coding with spatiotemporal semantic constraints. A prototype system that implements data storage management functions was built on MongoDB and used to verify the performance of the proposed multilevel partition algorithm. Wiener distributed storage, based on the principle of the Wiener filter, is used to store the workload of each distributed storage window in a distributed manner. The work targets specific types of distributed storage workloads: according to its periodicity, a workload is divided into distributed storage windows of a specific duration. At the beginning of each distributed storage window, storage is distributed for the next window. Experiments and tests verify the proposed distributed storage strategy and show that the Wiener distributed storage solution can save platform resources and configuration costs while still meeting the Service Level Agreement (SLA).
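
The abstract names Geohash and Hilbert coding as the building blocks of the unified spatiotemporal code but does not give the construction. The following is a minimal sketch of how such a key might be assembled, not the paper's actual method. The function names, the hourly time bucket, the `|` separator, the Geohash precision, and the global Hilbert grid of order 8 are all assumptions made for illustration.

```python
from datetime import datetime

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard Geohash alphabet


def geohash_encode(lat, lon, precision=5):
    """Standard Geohash: interleave longitude/latitude bisection bits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    bits = []
    use_lon = True  # Geohash starts with a longitude bit
    while len(bits) < precision * 5:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits.append(1)
                lon_lo = mid
            else:
                bits.append(0)
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits.append(1)
                lat_lo = mid
            else:
                bits.append(0)
                lat_hi = mid
        use_lon = not use_lon
    chars = []
    for i in range(0, len(bits), 5):  # 5 bits per base-32 character
        value = 0
        for b in bits[i:i + 5]:
            value = (value << 1) | b
        chars.append(BASE32[value])
    return "".join(chars)


def hilbert_index(order, x, y):
    """Map cell (x, y) of a 2^order x 2^order grid to its Hilbert curve index."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate/flip the quadrant so the curve stays continuous
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def spatiotemporal_key(lat, lon, when, precision=5, order=8):
    """Hypothetical composite key: hour bucket, Geohash cell, Hilbert index."""
    n = 1 << order
    col = min(int((lon + 180.0) / 360.0 * n), n - 1)  # assumed global grid
    row = min(int((lat + 90.0) / 180.0 * n), n - 1)
    bucket = when.strftime("%Y%m%d%H")  # assumed hourly time division
    return f"{bucket}|{geohash_encode(lat, lon, precision)}|{hilbert_index(order, col, row):06d}"


print(spatiotemporal_key(57.64911, 10.40744, datetime(2020, 1, 1, 12)))
```

Placing the time bucket first groups records by period, and the Geohash prefix then groups them by region. A real design would choose the component order, and any semantic constraint field, to match its dominant query pattern.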
