A tradeoff analysis of delayed reconstruction for storage clusters

Qiang Cao, Hongyan Li, Yan Yang, Makoto Takizawa, Naixue Xiong
DOI: https://doi.org/10.1145/1815396.1815588
2010-01-01
Abstract: Because a large share of node failures in a storage cluster do not actually destroy the data on disk, and some failed nodes recover quickly, a policy that defers reconstruction for a certain time after a node failure, in case the node recovers, can reduce unnecessary data rebuilding. Such a policy is feasible and attractive, but it also introduces a risk of data loss. In this paper, we present two delayed-reconstruction algorithms that differ in how the delay time is set: static and dynamic. A qualitative approach is proposed to analyze the reliability, risk, and benefit of the two methods. Numerical results show that, under a certain distribution function of repair time for failed nodes, the static method has an optimal delay time that balances risk against benefit. Moreover, the dynamic method controls risk better than the static method, at the cost of a higher probability of launching reconstruction.
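The tradeoff behind a static delay can be illustrated with a minimal sketch. It is not the paper's model: it assumes exponentially distributed repair times and independent exponential failures of the surviving replicas, and every name and parameter (`recovery_probability`, `loss_risk`, `mean_repair_time`, `failure_rate`, `surviving_replicas`) is hypothetical.

```python
import math


def recovery_probability(delay, mean_repair_time):
    """Benefit proxy: chance the failed node returns within `delay`.

    Assumes exponential repair times (an illustrative choice only).
    """
    return 1.0 - math.exp(-delay / mean_repair_time)


def loss_risk(delay, failure_rate, surviving_replicas):
    """Risk proxy: chance every surviving replica also fails within `delay`.

    Assumes independent exponential failures at `failure_rate` per unit time.
    """
    p_one = 1.0 - math.exp(-failure_rate * delay)
    return p_one ** surviving_replicas


def static_policy(delay, mean_repair_time, failure_rate, surviving_replicas):
    """Return (benefit, risk) for one fixed delay under the assumptions above."""
    benefit = recovery_probability(delay, mean_repair_time)
    risk = loss_risk(delay, failure_rate, surviving_replicas)
    return benefit, risk


if __name__ == "__main__":
    # Sweep a few candidate delays; both benefit and risk grow with the delay.
    for d in (1, 5, 10, 50):
        b, r = static_policy(d, mean_repair_time=10.0,
                             failure_rate=0.001, surviving_replicas=2)
        print(f"delay={d:>3}  benefit={b:.4f}  risk={r:.6f}")
```

Under these assumptions both quantities increase with the delay, which is why a fixed delay has to trade avoided rebuilds against exposure to data loss.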