A Practical Cross-Datacenter Fault-Tolerance Algorithm in the Cloud Storage System.
Yuxia Cheng,Xinjie Yu,Wenzhi Chen,Rui Chang,Yang Xiang
DOI: https://doi.org/10.1007/s10586-017-0840-5
2017-01-01
Cluster Computing
Abstract:The fault-tolerance property in most cloud storage systems are designed within the scale of a single datacenter. The single datacenter as a whole may be unreachable or crashed due to severe problems, such as broken network links, power supply interruptions, and natural disasters, etc. Therefore, the design of an effective cross-datacenter fault-tolerant storage system is important to protect data security in the cloud. However, building a cross-datacenter fault-tolerant system faces great challenges, such as high latency, low throughput, high costs of bandwidth resources between datacenters. In this paper, we propose a practical cross-datacenter fault-tolerant (CDFT) algorithm in the cloud storage system. Our fault-tolerant algorithm design considers the difficult tradeoffs among fault tolerance, latency, throughput, network and storage costs. We propose the Domain Fault Codes (DFC) and the topology-aware scheduling techniques, which can tolerate the whole datacenter breakdown. We implemented the DFC-CDFT algorithm in a prototype cloud storage system. The experimental results showed that the proposed DFC-CDFT algorithm can effectively recover data blocks from the single datacenter failure while achieves low storage and bandwidth costs.