Multiagent reinforcement learning in the iterated prisoner’s dilemma (1995)

by Tuomas W Sandholm, Robert H Crites
Venue: Biosystems