Repairing Multiple Failures with Coordinated and Adaptive Regenerating Codes

Anne-Marie Kermarrec, Nicolas Le Scouarnec, Gilles Straub
DOI: https://doi.org/10.1109/isnetcod.2011.5978920
2011-07-01
Abstract: Erasure-correcting codes are widely used to ensure data persistence in distributed storage systems. This paper addresses the simultaneous repair of multiple failures in such codes. We go beyond existing work (i.e., regenerating codes by Dimakis et al.) and propose coordinated regenerating codes, which allow devices to coordinate during simultaneous repairs, thus further reducing the costs. We define optimal coordinated regenerating codes that outperform existing codes for simultaneous repairs with respect to both storage and repair costs. We prove that deliberately delaying repairs does not bring additional gains (i.e., regenerating codes are optimal as long as each failure can be repaired before a second one occurs). Finally, we propose adaptive regenerating codes that self-adapt to the system state and prove they are optimal.