Lund University Publications


Nonconvergence to saddle boundary points under perturbed reinforcement learning

Chasparis, Georgios C.; Shamma, Jeff S. and Rantzer, Anders (2015) In International Journal of Game Theory 44(3). p. 667-699
Abstract
For several reinforcement learning models in strategic-form games, convergence to action profiles that are not Nash equilibria may occur with positive probability under certain conditions on the payoff function. In this paper, we explore how an alternative reinforcement learning model, where the strategy of each agent is perturbed by a strategy-dependent perturbation (or mutations) function, may exclude convergence to non-Nash pure strategy profiles. This approach extends prior analysis on reinforcement learning in games that addresses the issue of convergence to saddle boundary points. It further provides a framework under which the effect of mutations can be analyzed in the context of reinforcement learning.
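To illustrate the kind of model the abstract describes, the sketch below combines a Cross-type reinforcement update in a two-player game with a strategy-dependent perturbation that grows near the vertices of the simplex. This is only a minimal illustration of the general idea, not the paper's exact model: the payoff matrices, the specific mutation function, and the step size are all hypothetical choices made here for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical 2x2 coordination game: A[i, j] is agent 0's payoff and
# B[i, j] is agent 1's payoff when agent 0 plays i and agent 1 plays j.
# Payoffs are kept in [0, 1] so the update below stays in the simplex.
A = np.array([[1.0, 0.0], [0.0, 0.5]])
B = A.copy()

def perturb(x, lam=0.05):
    # Strategy-dependent mutation: mix toward the uniform strategy with a
    # weight that grows as x approaches a pure strategy (one illustrative
    # choice of perturbation function, not the one from the paper).
    eps = lam * (1.0 - 2.0 * x.min())   # eps -> lam at the simplex vertices
    return (1.0 - eps) * x + eps * np.full_like(x, 0.5)

x = np.array([0.2, 0.8])   # agent 0's mixed strategy
y = np.array([0.2, 0.8])   # agent 1's mixed strategy
step = 0.1
for _ in range(5000):
    # Sample actions from the perturbed strategies.
    i = rng.choice(2, p=perturb(x))
    j = rng.choice(2, p=perturb(y))
    # Cross-type reinforcement: move toward the realized action,
    # scaled by the payoff received.
    x += step * A[i, j] * (np.eye(2)[i] - x)
    y += step * B[i, j] * (np.eye(2)[j] - y)
```

Because the mutation term keeps both strategies away from the boundary of the simplex, trajectories of this kind of process cannot settle on a non-Nash pure profile, which is the behavior the paper analyzes.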
author
Chasparis, Georgios C.; Shamma, Jeff S. and Rantzer, Anders
organization
publishing date
2015
type
Contribution to journal
publication status
published
subject
keywords
Learning in games, Reinforcement learning, Replicator dynamics
in
International Journal of Game Theory
volume
44
issue
3
pages
667-699
publisher
Springer
external identifiers
  • wos:000358664800007
  • scopus:84938261308
ISSN
1432-1270
DOI
10.1007/s00182-014-0449-3
language
English
LU publication?
yes
id
2cfd7fa5-07d9-40b0-bc26-e362cce96d14 (old id 7767746)
date added to LUP
2016-04-01 10:43:11
date last changed
2023-10-12 12:25:11
@article{2cfd7fa5-07d9-40b0-bc26-e362cce96d14,
  abstract     = {{For several reinforcement learning models in strategic-form games, convergence to action profiles that are not Nash equilibria may occur with positive probability under certain conditions on the payoff function. In this paper, we explore how an alternative reinforcement learning model, where the strategy of each agent is perturbed by a strategy-dependent perturbation (or mutations) function, may exclude convergence to non-Nash pure strategy profiles. This approach extends prior analysis on reinforcement learning in games that addresses the issue of convergence to saddle boundary points. It further provides a framework under which the effect of mutations can be analyzed in the context of reinforcement learning.}},
  author       = {{Chasparis, Georgios C. and Shamma, Jeff S. and Rantzer, Anders}},
  issn         = {{1432-1270}},
  keywords     = {{Learning in games; Reinforcement learning; Replicator dynamics}},
  language     = {{eng}},
  number       = {{3}},
  pages        = {{667--699}},
  publisher    = {{Springer}},
  series       = {{International Journal of Game Theory}},
  title        = {{Nonconvergence to saddle boundary points under perturbed reinforcement learning}},
  url          = {{http://dx.doi.org/10.1007/s00182-014-0449-3}},
  doi          = {{10.1007/s00182-014-0449-3}},
  volume       = {{44}},
  year         = {{2015}},
}