Abstract
In this paper we revisit Reinforcement Learning and adaptiveness in Multi-Agent Systems (MAS) from an Evolutionary Game Theoretic perspective. More precisely, we show that there is a triangular relation between the fields of Multi-Agent Systems, Reinforcement Learning and Evolutionary Game Theory. We illustrate how these new insights can contribute to a better understanding of learning in MAS and to new, improved learning algorithms. All three fields are introduced in a self-contained manner. Each relation is discussed in detail, with the background needed to understand it and major references to relevant work.
Cite this article
Tuyls, K., Nowe, A., Lenaerts, T. et al. An Evolutionary Game Theoretic Perspective on Learning in Multi-Agent Systems. Synthese 139, 297–330 (2004). https://doi.org/10.1023/B:SYNT.0000024908.89191.f1