MAIA: The role of innate behaviors when picking flowers in Minecraft with Q-learning

Siljebråt, Henrik LU (2015) KOGM20 20151
Cognitive Science
Abstract
Recent advances in reinforcement learning research have achieved human-level performance in playing video games (Mnih et al., 2015). This inspired me to understand the methods of reinforcement learning (RL) and investigate whether there is any basis for those methods in neurobiology and animal learning theories. The current study shows how RL is based on theories of animal conditioning and that there is solid evidence for neurobiological correlates with RL algorithms, primarily in the basal ganglia complex. This motivated a simple perceptron-based model of the basal ganglia called Q-tron, which utilizes the Q-learning algorithm. Additionally, I wanted to explore the hypothesis that adding an innate behavior to a Q-learning agent would increase performance. Thus, four different agents were tasked with picking red flowers in the video game Minecraft, where performance was measured as the number of actions needed to pick a flower. A “pure” Q-learner called PQ used only the Q-tron model. MAIA (Minecraft Artificial Intelligence Agent) used the Q-tron model together with an innate behavior causing it to try picking when it saw red. Two mechanisms of the innate behavior were tested, creating MAIA1 and MAIA2, respectively. The fourth agent, called random walker (RW), chose actions at random and served as a baseline performance measure. We show that both MAIA versions have better performance than PQ, and MAIA1 has performance comparable to RW. Additionally, we show a difference in performance between MAIA1 and MAIA2 and argue that this shows the importance of investigations into the precise mechanisms underlying innate behaviors in animals in order to understand learning in general.
author
Siljebråt, Henrik LU
supervisor
organization
alternative title
MAIA: Instinkters roll vid blomplockning i Minecraft med Q-inlärning
course
KOGM20 20151
year
2015
type
H2 - Master's Degree (Two Years)
subject
keywords
reinforcement learning, q-learning, minecraft, innate behavior, artificial intelligence, conditioning, neuroscience, dopamine, prediction error, cognition, learning, neural networks, cognitive science
language
English
id
8051962
date added to LUP
2015-11-18 10:55:12
date last changed
2015-11-18 10:55:12
@misc{8051962,
  abstract     = {Recent advances in reinforcement learning research have achieved human-level performance in playing video games (Mnih et al., 2015). This inspired me to understand the methods of reinforcement learning (RL) and investigate whether there is any basis for those methods in neurobiology and animal learning theories. The current study shows how RL is based on theories of animal conditioning and that there is solid evidence for neurobiological correlates with RL algorithms, primarily in the basal ganglia complex. This motivated a simple perceptron-based model of the basal ganglia called Q-tron, which utilizes the Q-learning algorithm. Additionally, I wanted to explore the hypothesis that adding an innate behavior to a Q-learning agent would increase performance. Thus, four different agents were tasked with picking red flowers in the video game Minecraft, where performance was measured as the number of actions needed to pick a flower. A “pure” Q-learner called PQ used only the Q-tron model. MAIA (Minecraft Artificial Intelligence Agent) used the Q-tron model together with an innate behavior causing it to try picking when it saw red. Two mechanisms of the innate behavior were tested, creating MAIA1 and MAIA2, respectively. The fourth agent, called random walker (RW), chose actions at random and served as a baseline performance measure. We show that both MAIA versions have better performance than PQ, and MAIA1 has performance comparable to RW. Additionally, we show a difference in performance between MAIA1 and MAIA2 and argue that this shows the importance of investigations into the precise mechanisms underlying innate behaviors in animals in order to understand learning in general.},
  author       = {Siljebråt, Henrik},
  keyword      = {reinforcement learning,q-learning,minecraft,innate behavior,artificial intelligence,conditioning,neuroscience,dopamine,prediction error,cognition,learning,neural networks,cognitive science},
  language     = {eng},
  note         = {Student Paper},
  title        = {MAIA: The role of innate behaviors when picking flowers in Minecraft with Q-learning},
  year         = {2015},
}