[EN] Prisoner's Dilemma II - simulation and evolution
Last time, all of you created a strategy for the prisoner’s dilemma, in this lesson we will experiment with your strategies. We will try to simulate the life of strategies, in a similar way Axelrod did, and we will also try to evolve a successful strategy against a group of strategies.
Today, you will need your strategies from last time and some files with settings. You can download all these files in a single archive here. (The archive also contains this year’s versions of the files
input-sumulation.txt, which you should put into the
Simulation of the life of strategies
Imagine, we have a population of individuals, who repeatedly play iterated prisoners dilemma against each other, and based on the number of points they receive, they are able to mate and produce offspring who survive to the next generation.
We can observe some interesting phenomena, in such a simulation. For example, is there a strategy, which will survive in all cases? Is there a group of strategies, which cannot be defeated by any other (new) strategy in the population?
Let us take, for example, a group of individuals who play the Always Cooperate (AC) strategy. What happens, when a new individual arrives to the population and plays the Always Deceive (AD) strategy? All the AC individuals get 3 points fir each round played against another AC individual, but the AD individuals get 5 points for each round against an AC individual. There are a lot of AC individuals and only one AD individual, thus the AD individual will get more points than the AC individuals and will create more offspring in the population and the number of AD individuals will grow. Will the number of AD individuals grow until they fill the whole population or is there a point, where it will stop? And what would happen, if instead of AC we had Tit-for-tat individuals in the initial population?
The simulation of the life of strategies is implemented in the
evolution.prisoner.simulation. It uses the framework of evolutionary algorithms, but there are no operators. Each individual is an index to the table of all strategies. The number tells, which strategy the individual will play. Each individual meets 10 (can be set in
ga-prisoner-simulation.properties) other individuals during each generation and they play the iterated prisoner’s dilemma. Their fitness is than the sum of all the points they got in the 10 games. Finally, roulette wheel selection is performed and it creates the new population. The program outputs when a strategy dies out and in the end prints the number of individuals of each surviving strategy. The number of individuals of each strategy in the initial population can be set in the
Evolution of a strategy
It may be interesting to know, whether there is strategy, which wins against a group of strategies. We will not answer this question, but we will try to find such a strategy by using the evolutionary algorithms.
In the package
evolution.prisoner.evolution we have some source codes, which try to evolve such a strategy. It is quite difficult to come up with a suitable encoding of the individuals. In our case, each individual remembers the last three moves of the opponent and of itself. Based on these three moves, it decides, whether to cooperate or to deceive in the next turn. Thus, the individuals is a binary string of length 64 (2^6) and the techniques from the first lesson are used to optimize it. The fitness is the number of points the strategy obtains in a tournament of all strategies against which it is evolved.
Of course, there are different possibilities how to evolve a strategy. The simplest one can be to use evolutionary algorithm to tune some of the parameters of any strategy (e.g. the number of times I tolerate deception, or the number of times I deceive the opponent after they deceive me).
The group of strategies against which the new strategy is evolved can be set in the