Lund University Publications

The Missing Link Between Memory and Reinforcement Learning

Balkenius, Christian; Tjøstheim, Trond A.; Johansson, Birger; Wallin, Annika and Gärdenfors, Peter (2020) In Frontiers in Psychology 11.
Abstract

Reinforcement learning systems usually assume that a value function is defined over all states (or state-action pairs) that can immediately give the value of a particular state or action. These values are used by a selection mechanism to decide which action to take. In contrast, when humans and animals make decisions, they collect evidence for different alternatives over time and take action only when sufficient evidence has been accumulated. We have previously developed a model of memory processing that includes semantic, episodic and working memory in a comprehensive architecture. Here, we describe how this memory mechanism can support decision making when the alternatives cannot be evaluated based on immediate sensory information alone. Instead, we first imagine, and then evaluate, a possible future that will result from choosing one of the alternatives. We present an extended model of decision making that depends on accumulating evidence over time, whether that evidence comes from sequential attention to different sensory properties or from internal simulation of the consequences of making a particular choice. We show how the new model explains simple immediate choices, choices that depend on multiple sensory factors, and complicated selections between alternatives that require forward-looking simulations based on episodic and semantic memory structures. In this framework, vicarious trial and error is explained as an internal simulation that accumulates evidence for a particular choice. We argue that a system like this forms the “missing link” between more traditional ideas of semantic and episodic memory, and the associative nature of reinforcement learning.

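The abstract's central mechanism is that evidence for each alternative is accumulated over time, either from attended sensory properties or from internally simulated futures, and an action is taken only once one alternative reaches a threshold. The following is a minimal sketch of that idea, assuming a simple leaky race-style accumulator; it is not the authors' implementation, and all names (Accumulator, sample_evidence, decide) are hypothetical.

# Illustrative sketch (not the published model): each alternative gathers
# noisy evidence over time, with the evidence source standing in for either
# sensory sampling or an internal simulation of the outcome of choosing
# that alternative; the first accumulator to cross threshold wins.

import random


class Accumulator:
    def __init__(self, name, threshold=5.0, leak=0.05):
        self.name = name
        self.threshold = threshold
        self.leak = leak          # slow decay so stale evidence fades
        self.value = 0.0

    def add(self, evidence):
        # leaky integration of one evidence sample
        self.value = (1.0 - self.leak) * self.value + evidence
        return self.value >= self.threshold


def sample_evidence(alternative, simulate):
    # `simulate` stands in for an internal simulation of the consequences of
    # choosing the alternative (e.g. a rollout over episodic/semantic memory);
    # here it is just an expected value plus Gaussian noise.
    return simulate(alternative) + random.gauss(0.0, 0.5)


def decide(alternatives, simulate, max_steps=200):
    accumulators = [Accumulator(a) for a in alternatives]
    for step in range(max_steps):
        for acc in accumulators:
            if acc.add(sample_evidence(acc.name, simulate)):
                return acc.name, step       # first to threshold wins
    return None, max_steps                  # no decision reached in time


if __name__ == "__main__":
    # Toy "simulation": the imagined value of each alternative.
    expected_value = {"left": 0.3, "right": 0.6}
    choice, steps = decide(["left", "right"], expected_value.get)
    print(f"chose {choice} after {steps} evidence samples")

In this kind of sketch, the leak term makes old evidence fade so the decision reflects recent samples, and raising the threshold trades decision speed for reliability.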
author
Balkenius, Christian; Tjøstheim, Trond A.; Johansson, Birger; Wallin, Annika and Gärdenfors, Peter
organization
publishing date
2020
type
Contribution to journal
publication status
published
subject
keywords
accumulator model, decision making, episodic memory, memory model, semantic memory
in
Frontiers in Psychology
volume
11
article number
560080
publisher
Frontiers Media S. A.
external identifiers
  • pmid:33362625
  • scopus:85098165382
ISSN
1664-1078
DOI
10.3389/fpsyg.2020.560080
project
Ikaros: An infrastructure for system level modelling of the brain
Lund University AI Research
Cognitive modeling
Ethics for autonomous systems/AI
language
English
LU publication?
yes
id
53470e4e-c42d-4a93-ac50-39c6ed40a77a
date added to LUP
2021-01-05 12:57:52
date last changed
2024-03-05 17:18:32
@article{53470e4e-c42d-4a93-ac50-39c6ed40a77a,
  abstract     = {{<p>Reinforcement learning systems usually assume that a value function is defined over all states (or state-action pairs) that can immediately give the value of a particular state or action. These values are used by a selection mechanism to decide which action to take. In contrast, when humans and animals make decisions, they collect evidence for different alternatives over time and take action only when sufficient evidence has been accumulated. We have previously developed a model of memory processing that includes semantic, episodic and working memory in a comprehensive architecture. Here, we describe how this memory mechanism can support decision making when the alternatives cannot be evaluated based on immediate sensory information alone. Instead we first imagine, and then evaluate a possible future that will result from choosing one of the alternatives. Here we present an extended model that can be used as a model for decision making that depends on accumulating evidence over time, whether that information comes from the sequential attention to different sensory properties or from internal simulation of the consequences of making a particular choice. We show how the new model explains both simple immediate choices, choices that depend on multiple sensory factors and complicated selections between alternatives that require forward looking simulations based on episodic and semantic memory structures. In this framework, vicarious trial and error is explained as an internal simulation that accumulates evidence for a particular choice. We argue that a system like this forms the “missing link” between more traditional ideas of semantic and episodic memory, and the associative nature of reinforcement learning.</p>}},
  author       = {{Balkenius, Christian and Tjøstheim, Trond A. and Johansson, Birger and Wallin, Annika and Gärdenfors, Peter}},
  issn         = {{1664-1078}},
  keywords     = {{accumulator model; decision making; episodic memory; memory model; semantic memory}},
  language     = {{eng}},
  publisher    = {{Frontiers Media S. A.}},
  series       = {{Frontiers in Psychology}},
  title        = {{The Missing Link Between Memory and Reinforcement Learning}},
  url          = {{http://dx.doi.org/10.3389/fpsyg.2020.560080}},
  doi          = {{10.3389/fpsyg.2020.560080}},
  volume       = {{11}},
  year         = {{2020}},
}