Deep Distributional Temporal Difference Learning for Game Playing

Berglind, Frej; Chen, Jianhua; Sopasakis, Alexandros

(2021) Industrial Part of 25th International Symposium on Methodologies for Intelligent Systems, ISMIS 2020. In Studies in Computational Intelligence 949, p. 192-206

Abstract: We compare classic scalar temporal difference learning with three new distributional algorithms for playing the game of 5-in-a-row using deep neural networks: distributional temporal difference learning with constant learning rate, and two distributional temporal difference algorithms with adaptive learning rate. All these algorithms are applicable to any two-player deterministic zero sum game and can probably be successfully generalized to other settings. All algorithms in our study performed well and developed strong strategies. The algorithms implementing the adaptive methods learned more quickly in the beginning, but in the long run, they were outperformed by the algorithms using constant learning rate which, without any prior knowledge, learned to play the game at a very high level after 200 000 games of self play.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/2a876ade-a7e8-4b07-a277-d14fe74aa1cc

author

Berglind, Frej; Chen, Jianhua and Sopasakis, Alexandros (LU)

publishing date

2021

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Computer Sciences

host publication

Intelligent Systems in Industrial Applications

series title

Studies in Computational Intelligence

editor

Stettinger, Martin; Leitner, Gerhard; Felfernig, Alexander and Ras, Zbigniew W.

volume

949

pages

15 pages

publisher

Springer Science and Business Media B.V.

conference name

Industrial Part of 25th International Symposium on Methodologies for Intelligent Systems, ISMIS 2020

conference location

Graz, Austria

conference dates

2020-09-23 - 2020-09-25

external identifiers

scopus:85101802245

ISSN

1860-9503

1860-949X

ISBN

9783030671471

DOI

10.1007/978-3-030-67148-8_14

language

English

LU publication?

yes

id

2a876ade-a7e8-4b07-a277-d14fe74aa1cc

date added to LUP

2021-03-14 09:52:46

date last changed

2025-04-04 15:13:06

@inproceedings{2a876ade-a7e8-4b07-a277-d14fe74aa1cc,
  abstract     = {{We compare classic scalar temporal difference learning with three new distributional algorithms for playing the game of 5-in-a-row using deep neural networks: distributional temporal difference learning with constant learning rate, and two distributional temporal difference algorithms with adaptive learning rate. All these algorithms are applicable to any two-player deterministic zero sum game and can probably be successfully generalized to other settings. All algorithms in our study performed well and developed strong strategies. The algorithms implementing the adaptive methods learned more quickly in the beginning, but in the long run, they were outperformed by the algorithms using constant learning rate which, without any prior knowledge, learned to play the game at a very high level after 200 000 games of self play.}},
  author       = {{Berglind, Frej and Chen, Jianhua and Sopasakis, Alexandros}},
  booktitle    = {{Intelligent Systems in Industrial Applications}},
  editor       = {{Stettinger, Martin and Leitner, Gerhard and Felfernig, Alexander and Ras, Zbigniew W.}},
  isbn         = {{9783030671471}},
  issn         = {{1860-9503}},
  language     = {{eng}},
  pages        = {{192--206}},
  publisher    = {{Springer Science and Business Media B.V.}},
  series       = {{Studies in Computational Intelligence}},
  title        = {{Deep Distributional Temporal Difference Learning for Game Playing}},
  url          = {{http://dx.doi.org/10.1007/978-3-030-67148-8_14}},
  doi          = {{10.1007/978-3-030-67148-8_14}},
  volume       = {{949}},
  year         = {{2021}},
}

Lund University Publications
