Abstract
How can players reach a Nash equilibrium? I offer one possible explanation in terms of a low-rationality learning method called probe and adjust by proving that it converges to strict Nash equilibria in an important class of games. This demonstrates that decidedly limited learning methods can support Nash equilibrium play.
Notes
We speak of actions rather than strategies because, strictly speaking, we consider infinitely repeated games, in which a strategy specifies a choice at each of a player's information sets.
This is determined entirely by the state \((a, a^p)\) since \(a\) is the current state and the action profile of the previous state can be reconstructed from \(a\) and \(a^p\).
The notation \(\xrightarrow{O(\varepsilon^k)}\) indicates the order of the transition probability in question.
On a better reply path, \(u_i(a^{k+1}) > u_i(a^k)\) for exactly one player.
This follows since the stochastic potential of (A, A) and (B, B) is the same under the rule of Marden et al.
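The better reply condition in the note above can be illustrated with a minimal sketch. It assumes the usual reading of the definition: at each step exactly one player changes action, and that player's payoff strictly increases. The function name `is_better_reply_path`, the payoff table, and the two example paths are hypothetical and serve only as illustration; they are not taken from the article.

```python
from typing import Dict, Sequence, Tuple

Profile = Tuple[str, ...]

# Hypothetical two-player payoff table: profile -> (payoff of player 0, payoff of player 1).
PAYOFFS: Dict[Profile, Tuple[float, float]] = {
    ("A", "A"): (2.0, 1.0),
    ("A", "B"): (2.0, 0.0),
    ("B", "A"): (0.0, 0.0),
    ("B", "B"): (1.0, 2.0),
}


def is_better_reply_path(path: Sequence[Profile],
                         payoffs: Dict[Profile, Tuple[float, ...]]) -> bool:
    """Check that every step changes exactly one player's action and
    strictly raises that player's payoff (assumed reading of the definition)."""
    for current, nxt in zip(path, path[1:]):
        # Players whose action differs between consecutive profiles.
        movers = [i for i, (x, y) in enumerate(zip(current, nxt)) if x != y]
        if len(movers) != 1:
            return False
        i = movers[0]
        if payoffs[nxt][i] <= payoffs[current][i]:
            return False
    return True


# Player 1 (the second player) switches from B to A and gains: 0 -> 1.
path_ok = [("A", "B"), ("A", "A")]
# Player 1 switches from A to B and loses: 1 -> 0.
path_bad = [("A", "A"), ("A", "B")]

print(is_better_reply_path(path_ok, PAYOFFS))
print(is_better_reply_path(path_bad, PAYOFFS))
```

In this hypothetical table, (A, A) and (B, B) are strict Nash equilibria, so no better reply path leads out of either of them.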
References
Bala V, Goyal S (2000) A noncooperative model of network formation. Econometrica 68:1181–1229
Fudenberg D, Levine DK (1998) The theory of learning in games. MIT Press, Cambridge, MA
Huttegger SM, Skyrms B (2012) Emergence of a signaling network with “probe and adjust.” In: Calcott B, Joyce R, Sterelny K (eds) Signaling, commitment, and emotion. MIT Press, Cambridge, MA
Huttegger SM, Skyrms B, Zollman KJS (2013) Probe and adjust in information transfer games. Erkenntnis (forthcoming)
Lewis D (1969) Convention. A philosophical study. Harvard University Press, Cambridge, MA
Marden JR, Young HP, Arslan G, Shamma JS (2009) Payoff-based dynamics for multiplayer weakly acyclic games. SIAM J Control Optim 48:373–396
Nowak MA, Sigmund K (1993) A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner’s Dilemma game. Nature 364:56–58
Roth A, Erev I (1995) Learning in extensive form games: experimental data and simple dynamic models in the intermediate term. Games Econ Behav 8:164–212
Simon HA (1955) A behavioral model of rational choice. Q J Econ 69:99–118
Skyrms B (2010) Signals: evolution, learning, and information. Oxford University Press, Oxford
Young HP (1993) The evolution of conventions. Econometrica 61:57–83
Young HP (1998) Individual strategy and social structure. An evolutionary theory of institutions. Princeton University Press, Princeton
Young HP (2004) Strategic learning and its limits. Oxford University Press, Oxford
Acknowledgments
I would like to thank the KLI for their hospitality and for hosting the workshop. This material is based upon work supported by the National Science Foundation under Grant No. EF 1038456. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.
Cite this article
Huttegger, S.M. Probe and Adjust. Biol Theory 8, 195–200 (2013). https://doi.org/10.1007/s13752-013-0114-2
DOI: https://doi.org/10.1007/s13752-013-0114-2