Lund University Publications

Deep Distributional Temporal Difference Learning for Game Playing

Berglind, Frej ; Chen, Jianhua and Sopasakis, Alexandros LU (2021) Industrial Part of 25th International Symposium on Methodologies for Intelligent Systems, ISMIS 2020 In Studies in Computational Intelligence 949. p.192-206
Abstract

We compare classic scalar temporal difference learning with three new distributional algorithms for playing the game of 5-in-a-row using deep neural networks: distributional temporal difference learning with constant learning rate, and two distributional temporal difference algorithms with adaptive learning rate. All these algorithms are applicable to any two-player deterministic zero-sum game and can probably be successfully generalized to other settings. All algorithms in our study performed well and developed strong strategies. The algorithms implementing the adaptive methods learned more quickly in the beginning, but in the long run, they were outperformed by the algorithms using constant learning rate, which, without any prior knowledge, learned to play the game at a very high level after 200 000 games of self-play.

author
Berglind, Frej ; Chen, Jianhua and Sopasakis, Alexandros
organization
publishing date
2021
type
Chapter in Book/Report/Conference proceeding
publication status
published
subject
host publication
Intelligent Systems in Industrial Applications
series title
Studies in Computational Intelligence
editor
Stettinger, Martin ; Leitner, Gerhard ; Felfernig, Alexander and Ras, Zbigniew W.
volume
949
pages
15 pages
publisher
Springer Science and Business Media B.V.
conference name
Industrial Part of 25th International Symposium on Methodologies for Intelligent Systems, ISMIS 2020
conference location
Graz, Austria
conference dates
2020-09-23 - 2020-09-25
external identifiers
  • scopus:85101802245
ISSN
1860-9503
1860-949X
ISBN
9783030671471
DOI
10.1007/978-3-030-67148-8_14
language
English
LU publication?
yes
id
2a876ade-a7e8-4b07-a277-d14fe74aa1cc
date added to LUP
2021-03-14 09:52:46
date last changed
2024-05-30 08:05:05
@inproceedings{2a876ade-a7e8-4b07-a277-d14fe74aa1cc,
  abstract     = {{We compare classic scalar temporal difference learning with three new distributional algorithms for playing the game of 5-in-a-row using deep neural networks: distributional temporal difference learning with constant learning rate, and two distributional temporal difference algorithms with adaptive learning rate. All these algorithms are applicable to any two-player deterministic zero-sum game and can probably be successfully generalized to other settings. All algorithms in our study performed well and developed strong strategies. The algorithms implementing the adaptive methods learned more quickly in the beginning, but in the long run, they were outperformed by the algorithms using constant learning rate, which, without any prior knowledge, learned to play the game at a very high level after 200 000 games of self-play.}},
  author       = {{Berglind, Frej and Chen, Jianhua and Sopasakis, Alexandros}},
  booktitle    = {{Intelligent Systems in Industrial Applications}},
  editor       = {{Stettinger, Martin and Leitner, Gerhard and Felfernig, Alexander and Ras, Zbigniew W.}},
  isbn         = {{9783030671471}},
  issn         = {{1860-9503}},
  language     = {{eng}},
  pages        = {{192--206}},
  publisher    = {{Springer Science and Business Media B.V.}},
  series       = {{Studies in Computational Intelligence}},
  title        = {{Deep Distributional Temporal Difference Learning for Game Playing}},
  url          = {{http://dx.doi.org/10.1007/978-3-030-67148-8_14}},
  doi          = {{10.1007/978-3-030-67148-8_14}},
  volume       = {{949}},
  year         = {{2021}},
}