The Exploitation of Cooperation in Iterated Prisoner's Dilemma

Paul J. Darwen, Xin Yao
1996-01-01
Abstract: We follow Axelrod [2] in using the genetic algorithm to play Iterated Prisoner's Dilemma. Each member of the population (i.e., each strategy) is evaluated by how it performs against the other members of the current population. This creates a dynamic environment in which the algorithm is optimising to a moving target instead of the usual evaluation against some fixed set of strategies, causing an "arms race" of innovation [3]. We conduct two sets of experiments. The first set investigates what conditions evolve the best strategies. The second set studies the robustness of the strategies thus evolved, that is, are the strategies useful only in the round robin of their population or are they effective against a wide variety of opponents? Our results indicate that the population has nearly always converged by about 250 generations, by which time the bias in the population has almost always stabilised at 85%. Our results confirm that cooperation almost always becomes the dominant strategy [1, 2]. We can also confirm that seeding the population with expert strategies is best done in small amounts so as to leave the initial population with plenty of genetic diversity [7]. The lack of robustness in strategies produced in the round robin evaluation is demonstrated by some examples of a population of naive cooperators being exploited by a defect-first strategy. This causes a sudden but ephemeral decline in the population's average score, but it recovers when less naive cooperators emerge and do well against the exploiting strategies. This example of runaway evolution is brought back to reality by a suitable mutation, reminiscent of punctuated equilibria [12]. We find that a way to reduce such naivety is to make the GA population play against an extra, static, high-quality strategy (not part of the GA population), as well as all the rest of the population.
The strategies thus produced perform better against opponents that were included in the round robin (as expected) and, more significantly, better against opponents that were not included. That is, robustness is improved.
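The round-robin evaluation with an extra static opponent, as described in the abstract, might be sketched as follows. This is a minimal illustration and not the authors' implementation: the payoff values (5, 3, 1, 0) are the conventional ones and are assumed here, the three toy strategies stand in for evolved GA individuals, and the names `play` and `round_robin_fitness` are hypothetical. Tit-for-tat is used only as an example of a static high-quality strategy.

```python
from typing import Callable, List, Optional, Tuple

C, D = "C", "D"

# (my move, opponent's move) -> my payoff. Conventional values, assumed here;
# the abstract does not state the payoff matrix.
PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}

# A strategy maps (own history, opponent's history) to its next move.
Strategy = Callable[[List[str], List[str]], str]


def always_cooperate(own: List[str], opp: List[str]) -> str:
    """A naive cooperator."""
    return C


def always_defect(own: List[str], opp: List[str]) -> str:
    """An exploiter of naive cooperators."""
    return D


def tit_for_tat(own: List[str], opp: List[str]) -> str:
    """Cooperate first, then copy the opponent's previous move."""
    return opp[-1] if opp else C


def play(a: Strategy, b: Strategy, rounds: int = 10) -> Tuple[int, int]:
    """Play one iterated game and return the total payoffs of a and b."""
    hist_a: List[str] = []
    hist_b: List[str] = []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a(hist_a, hist_b)
        move_b = b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


def round_robin_fitness(
    population: List[Strategy],
    extra: Optional[Strategy] = None,
    rounds: int = 10,
) -> List[float]:
    """Average per-round payoff of each member against all other members,
    plus (optionally) one static opponent that is not in the population."""
    n = len(population)
    totals = [0] * n
    games = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            s_i, s_j = play(population[i], population[j], rounds)
            totals[i] += s_i
            totals[j] += s_j
            games[i] += 1
            games[j] += 1
    if extra is not None:
        # The static opponent is scored against but never evolves.
        for i, member in enumerate(population):
            s_i, _ = play(member, extra, rounds)
            totals[i] += s_i
            games[i] += 1
    return [t / (g * rounds) if g else 0.0 for t, g in zip(totals, games)]


if __name__ == "__main__":
    pop = [always_cooperate, always_defect]
    print(round_robin_fitness(pop))
    print(round_robin_fitness(pop, extra=tit_for_tat))
```

In this toy setting the fitness of each member depends on who else is in the population, which is the moving-target property the abstract refers to; passing `extra` adds one fixed reference point to every member's score.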