CRMS: A centralized replication management scheme for cloud storage system

Kangxian Huang, Dagang Li, Yongyue Sun
DOI: https://doi.org/10.1109/ICCChina.2014.7008299
2014-01-01
Abstract: As distributed storage clusters have become widely used in recent years, data replication management, which is key to data availability, has become an active research topic. In storage clusters, internal network bandwidth is usually a scarce resource. Misplaced replicas may consume too much network bandwidth and significantly degrade the overall performance of the cluster. To reduce internal network traffic and improve load balancing in distributed storage clusters, we developed a centralized replication management scheme referred to as CRMS. A model is proposed to capture the relationships among block access probability, replica location, and network traffic. Based on this model, the replica placement problem is formulated as a 0-1 programming optimization problem. Based on a feasible solution to this problem, a heuristic is proposed to carry out the replica adjustments step by step. CRMS is evaluated using the access history from a distributed storage cluster of Xunlei Inc., one of the leading Internet companies in China. The experimental results show that CRMS can greatly reduce internal network bandwidth consumption while keeping the cluster's storage usage balanced.
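To make the idea of a 0-1 replica placement problem concrete, the following is a minimal toy sketch in Python. It is not the formulation from the paper: the abstract does not give the model, so the cost function (probability mass of accesses that originate at a node without a local replica), the per-node capacity limit, and all names (`remote_traffic`, `best_placement`, `access_prob`) are illustrative assumptions. It also searches exhaustively rather than using the paper's heuristic, so it is only practical for very small inputs.

```python
from itertools import combinations, product


def remote_traffic(placement, access_prob):
    """Assumed cost: total probability of accesses that cross the network.

    placement[b] is the collection of nodes holding a replica of block b.
    access_prob[b][n] is the (hypothetical) probability that an access to
    block b originates at node n.
    """
    return sum(
        p
        for b, probs in enumerate(access_prob)
        for n, p in enumerate(probs)
        if n not in placement[b]  # no local replica, so the access is remote
    )


def best_placement(access_prob, num_nodes, replicas, capacity):
    """Exhaustively search all 0-1 placements (toy sizes only).

    Each block gets exactly `replicas` copies, and no node may hold more
    than `capacity` replicas (an assumed stand-in for storage balance).
    """
    per_block = list(combinations(range(num_nodes), replicas))
    best, best_cost = None, float("inf")
    for choice in product(per_block, repeat=len(access_prob)):
        load = [0] * num_nodes
        for nodes in choice:
            for n in nodes:
                load[n] += 1
        if max(load) > capacity:
            continue  # violates the assumed capacity constraint
        cost = remote_traffic(choice, access_prob)
        if cost < best_cost:
            best, best_cost = choice, cost
    return best, best_cost


# Two blocks, three nodes, one replica each, at most one replica per node.
placement, cost = best_placement(
    [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]], num_nodes=3, replicas=1, capacity=1
)
print(placement, cost)
```

In this toy instance each block ends up on the node that accesses it most often. With a tight capacity limit, a block can be pushed off its preferred node, which is the kind of trade-off between traffic and storage balance the abstract refers to.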