Deep Distributional Temporal Difference Learning for Game Playing
(2021) Industrial Part of 25th International Symposium on Methodologies for Intelligent Systems, ISMIS 2020 In Studies in Computational Intelligence 949. p.192-206
- Abstract
We compare classic scalar temporal difference learning with three new distributional algorithms for playing the game of 5-in-a-row using deep neural networks: distributional temporal difference learning with a constant learning rate, and two distributional temporal difference algorithms with adaptive learning rates. All these algorithms are applicable to any two-player deterministic zero-sum game and can probably be successfully generalized to other settings. All algorithms in our study performed well and developed strong strategies. The algorithms implementing the adaptive methods learned more quickly in the beginning, but in the long run, they were outperformed by the algorithms using a constant learning rate which, without any prior knowledge, learned to play the game at a very high level after 200 000 games of self-play.
- author
- Berglind, Frej ; Chen, Jianhua and Sopasakis, Alexandros (LU)
- organization
- publishing date
- 2021
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- host publication
- Intelligent Systems in Industrial Applications
- series title
- Studies in Computational Intelligence
- editor
- Stettinger, Martin ; Leitner, Gerhard ; Felfernig, Alexander and Ras, Zbigniew W.
- volume
- 949
- pages
- 15 pages
- publisher
- Springer Science and Business Media B.V.
- conference name
- Industrial Part of 25th International Symposium on Methodologies for Intelligent Systems, ISMIS 2020
- conference location
- Graz, Austria
- conference dates
- 2020-09-23 - 2020-09-25
- external identifiers
- scopus:85101802245
- ISSN
- 1860-9503
- 1860-949X
- ISBN
- 9783030671471
- DOI
- 10.1007/978-3-030-67148-8_14
- language
- English
- LU publication?
- yes
- id
- 2a876ade-a7e8-4b07-a277-d14fe74aa1cc
- date added to LUP
- 2021-03-14 09:52:46
- date last changed
- 2025-03-14 03:07:44
@inproceedings{2a876ade-a7e8-4b07-a277-d14fe74aa1cc,
  abstract     = {{We compare classic scalar temporal difference learning with three new distributional algorithms for playing the game of 5-in-a-row using deep neural networks: distributional temporal difference learning with constant learning rate, and two distributional temporal difference algorithms with adaptive learning rate. All these algorithms are applicable to any two-player deterministic zero sum game and can probably be successfully generalized to other settings. All algorithms in our study performed well and developed strong strategies. The algorithms implementing the adaptive methods learned more quickly in the beginning, but in the long run, they were outperformed by the algorithms using constant learning rate which, without any prior knowledge, learned to play the game at a very high level after 200 000 games of self play.}},
  author       = {{Berglind, Frej and Chen, Jianhua and Sopasakis, Alexandros}},
  booktitle    = {{Intelligent Systems in Industrial Applications}},
  editor       = {{Stettinger, Martin and Leitner, Gerhard and Felfernig, Alexander and Ras, Zbigniew W.}},
  isbn         = {{9783030671471}},
  issn         = {{1860-9503}},
  language     = {{eng}},
  pages        = {{192--206}},
  publisher    = {{Springer Science and Business Media B.V.}},
  series       = {{Studies in Computational Intelligence}},
  title        = {{Deep Distributional Temporal Difference Learning for Game Playing}},
  url          = {{http://dx.doi.org/10.1007/978-3-030-67148-8_14}},
  doi          = {{10.1007/978-3-030-67148-8_14}},
  volume       = {{949}},
  year         = {{2021}},
}