Abstract: With the development of cloud computing and the Internet, revenue from e-commerce, e-business, and the corporate world is growing rapidly. These areas require scalable and consistent databases. NoSQL databases such as HBase have been shown to scale and perform well on cloud computing platforms. However, some special data inevitably grows slowly but is accessed frequently, which leads to hotspot data and an unbalanced access distribution across data storage servers. Because of their properties, such data often cannot be stored in multiple tables. Some storage nodes then become the bottleneck of the distributed storage system, so adding nodes no longer improves performance, which severely limits the scalability of the storage system. To make cluster performance grow with cluster size, we devise a new distributed database storage framework that addresses these issues by changing how hotspot data is stored, read, and written. This structure guarantees that hotspot data does not aggregate on the same storage node and that the data on any single storage node does not become too hot. We implement the scalable database on top of Apache HBase; under heavy read-write pressure, it achieves almost double the throughput with only double the read substitutes. In addition, heavily loaded nodes caused by hotspot data no longer appear in the new distributed database.

Heterogeneous Replicas for Multi-dimensional Data Management

Optimize Multidimensional Arrays Queries with Heterogeneous Replica Method

Heterogeneous Replica for Query on Cassandra

IMPLEMENT DATA SHARING OVER NETWORK HETEROGENEOUS DATABASES BY DATA DIMENSION REDUCTION METHOD

A Request Skew Aware Heterogeneous Distributed Storage System Based on Cassandra

A Combination Replication Strategy for Data-Intensive Services in Distributed Geographic Information System

Coexistence of Multiple Partition Plan Based Physical Database Design.

An Experimental Evaluation of Performance of A Hadoop Cluster on Replica Management

Optimizing Data Partition for Scaling out Nosql Cluster

An Optimized Replica Distribution Method in Cloud Storage System

Dynamic Data Replication in Distributed Systems

hStorage-DB: Heterogeneity-aware Data Management to Exploit the Full Capability of Hybrid Storage Systems

Design of A More Scalable Database System

CRMS: A centralized replication management scheme for cloud storage system

Dynamic Data Storage and Management Strategies for Distributed File System

PRS: A Pattern-Directed Replication Scheme for Heterogeneous Object-Based Storage

An Efficient Replicated System for the Metadata of HDFS

A dynamic optimal replication strategy in data grid environment

A Practice Of Tpc-Ds Multidimensional Implementation On Nosql Database Systems

Replica-aware Data Recovery Performance Improvement for Hadoop System with NVM

A Dynamic Replication Management Strategy in Distributed GIS