Characterizing fault tolerance in genetic programming

Daniel Lombraña González, Francisco Fernández de Vega, Henri Casanova
DOI: https://doi.org/10.1016/j.future.2010.02.006
IF: 7.307
2010-06-01
Future Generation Computer Systems
Abstract: Evolutionary algorithms, including genetic programming (GP), are frequently employed to solve difficult real-life problems, which can require up to days or months of computation. An approach for reducing the time-to-solution is to use parallel computing on distributed platforms. Large platforms such as these are prone to failures, which can even be commonplace events rather than rare occurrences. Thus, fault tolerance and recovery techniques are typically necessary. The aim of this article is to show the inherent ability of parallel GP to tolerate failures in distributed platforms without using any fault-tolerant technique. This ability is quantified via simulation experiments performed using failure traces from real-world distributed platforms, namely, desktop grids, for two well-known problems.