[EN] Prisoner's Dilemma I - strategies
In today’s lesson, we will replicate the tournament of prisoner’s dilemma strategies you talked about with Roman during the lecture. Each of you will create their own strategy and we will let them compete against each other.
Prisoner’s dilemma rules
Prisoner’s dilemma is a two-player game based on a story about two prisoners: two friends commit crimes together until they are caught. The police do not have enough evidence against them and although they suspect them of more crimes, they can be punished only for the last one. The police thus decides to put each of the friends in a room and offer them a lower punishment for betraying the other.
Each of the prisoners has two possibilities, either to cooperate with the other, or to defect them. In case both of them cooperate, the police do not have enough evidence to convince them from other crimes and they both receive only a small punishment for the last crime. In case one of them cooperates but the other defects, the one that cooperated gets a large punishment and the other is let go free (as a reward for the help). In case both of them defect the other, they are both punished by a bigger punishment, however, it is smaller than the one the cooperating one would receive in case the other defected them (because they both cooperated and helped to solve the case).
Let’s replace the length of the punishment by points (i.e. shorter punishment = more points) and look at the game from the game theoretic point of view. We have a two-player game with the following reward matrix.
In the matrix, the rows are my choices and the columns are the opponent's choices. The numbers are the points I would receive in all possible situations.
When we investigate the matrix, we can determine, how the players will play. Rational player will probably think along these lines:
- I can either cooperate or defect
- If I assume, the opponent will defect me, I should defect them (1>0)
- If I assume, the opponent will cooperate with me, I should defect them again (5>3)
- Therefore, I will defect my opponent.
The opponent follows the same though process and arrives to the same conclusion. Therefore, they both defect each other and receive one point. If the prisoner’s dilemma is played only once, this is the only outcome we can get, assuming both the prisoners are rational.
Iterated prisoner’s dilemma
If the game is played repeatedly (the same players play with each other multiple times and both of them try to maximize their cumulative reward) the situation becomes more interesting. In this case, there is the possibility to cooperate with each other, as any defection can be punished in the next iteration.
However, it is important, that the game is either played indefinitely, or that neither of the players know how many iterations there are. In case they knew the number of iterations, it would be better for them to defect in the last iteration. And because both of them knew, they would defect in the last iteration, it would be better for them to defect in the second to last iteration. And so on.
Iterated prisoner’s dilemma strategies
There are a few simple strategies for iterated prisoners dilemma:
- Always Defect - always defects, regardless of the play of the other player (this strategy is called Always Deceive in the sources)
- Always Cooperate - always cooperates, regardless of the play of the other player
- Random - plays randomly
- Tit-for-tat - cooperates in the first round and plays what the other player played in the next rounds, i.e. it defects if the player defected in the previous round (punishes the defection), and cooperates if the other player cooperated
You can of course make much more different strategies. For inspiration, you can have a look at the Axelrod’s paper he published about the tournament he made in the 80s.
Competition in the lesson
Today’s source codes contain a simple program to execute the tournament of all strategies it find in a specific directory. It expects to find Java
.class files there (compiled Java sources). Each of these files should contain a class inherited from the evolution.prisoner.Strategy class. Make your own (preferably unique) name for the class which implements your strategy. You can should also implement the methods
getAuthorName, which return the name of the strategy and the name of the author of the strategy.
You also have to implement the
reset method, which is called right after your strategy ends a game against a different strategy (before it starts to play with another strategy). The most important method is
nextMove, which returns your next move. Use constants
Move.DECEIVE. The method
reward informs you about the move of the opponent and about the number of points you received for your last move.
At the beginning of the
evolution.prisoner.strategies.PrisonerDilemma file, the directory where the compiled strategies are is set. By default, it is set to the directory IntelliJ IDEA uses to store the compiled file. Change it to the directory your environment uses (e.g. it should be
build/classes/evolution/prisoner/strategies, if you use Eclipse).