The Missing Link Between Memory and Reinforcement Learning
Frontiers in Psychology 11 (2020)
Abstract
Reinforcement learning systems usually assume that a value function is defined over all states that can immediately give the value of a particular state or action. These values are used by a selection mechanism to decide which action to take. In contrast, when humans and animals make decisions, they collect evidence for different alternatives over time and take action only when sufficient evidence has been accumulated. We have previously developed a model of memory processing that includes semantic, episodic and working memory in a comprehensive architecture. Here, we describe how this memory mechanism can support decision making when the alternatives cannot be evaluated based on immediate sensory information alone. Instead, we first imagine and then evaluate a possible future that will result from choosing one of the alternatives. Here we present an extended model that can be used as a model for decision making that depends on accumulating evidence over time, whether that information comes from the sequential attention to different sensory properties or from internal simulation of the consequences of making a particular choice. We show how the new model explains simple immediate choices, choices that depend on multiple sensory factors, and complicated selections between alternatives that require forward-looking simulations based on episodic and semantic memory structures. In this framework, vicarious trial and error is explained as an internal simulation that accumulates evidence for a particular choice. We argue that a system like this forms the “missing link” between more traditional ideas of semantic and episodic memory, and the associative nature of reinforcement learning.
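The contrast drawn in the abstract, between reading off a value immediately and accumulating evidence until a threshold is reached, can be illustrated with a minimal sketch. This is not the authors' model: it is a simplified leaky accumulator without competition between alternatives, and every name and parameter in it (`accumulate_to_threshold`, `evidence_fn`, `threshold`, `leak`, the two example evidence sources) is a hypothetical choice made for illustration.

```python
import random


def accumulate_to_threshold(evidence_fn, n_options, threshold=1.0, leak=0.1,
                            noise_sd=0.0, max_steps=1000, seed=0):
    """Accumulate evidence per option until one total reaches the threshold.

    evidence_fn(step) returns one momentary evidence sample per option.
    Returns (chosen_index, steps_taken); chosen_index is None on timeout.
    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    totals = [0.0] * n_options
    for step in range(1, max_steps + 1):
        samples = evidence_fn(step)
        for i in range(n_options):
            noise = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
            # Leaky integration: old evidence decays, new evidence is added.
            totals[i] = max(0.0, (1.0 - leak) * totals[i] + samples[i] + noise)
        best = max(range(n_options), key=lambda i: totals[i])
        if totals[best] >= threshold:
            return best, step
    return None, max_steps


def immediate_evidence(step):
    # Strong sensory evidence for option 0: the choice is almost immediate.
    return [1.2, 0.1]


def simulated_evidence(step):
    # Weak evidence per step, standing in for repeated internal simulation
    # of each alternative; option 1 is only slightly favored.
    return [0.05, 0.15]


choice_fast, steps_fast = accumulate_to_threshold(immediate_evidence, 2)
choice_slow, steps_slow = accumulate_to_threshold(simulated_evidence, 2)
print(choice_fast, steps_fast)
print(choice_slow, steps_slow)
```

With strong evidence the threshold is crossed on the first step, while weak per-step evidence takes several steps to produce the same kind of decision. The same loop serves both cases; only the source of the evidence samples differs.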
DOI
10.3389/fpsyg.2020.560080
Similar books and articles
Making decisions about the future: Regret and the cognitive function of episodic memory. Christoph Hoerl & Teresa McCormack - 2016 - In Kourken Michaelian, Stanley Klein & Karl Szpunar (eds.), Seeing the future: Theoretical perspectives on future-oriented mental time travel. Oxford University Press. pp. 241-266.
A model for memory systems based on processing modes rather than consciousness. Katharina Henke - 2010 - Nature 11.
Analysis on Mental Structures in Language Learning. Ya-Ping Cui - 2005 - Philosophy of the Social Sciences 35 (3):147-150.
Remembering past experiences: episodic memory, semantic memory and the epistemic asymmetry. Christoph Hoerl - 2018 - In Kourken Michaelian, Dorothea Debus & Denis Perrin (eds.), New Directions in the Philosophy of Memory. Routledge. pp. 313-328.
Episodic memory in semantic dementia: Implications for the roles played by the perirhinal and hippocampal memory systems in new learning. Kim S. Graham & John R. Hodges - 1999 - Behavioral and Brain Sciences 22 (3):452-453.
Enhanced action control as a prior function of episodic memory. Philipp Rau & George Botterill - 2018 - Behavioral and Brain Sciences 41.
What is episodic memory if it is a natural kind? Sen Cheng & Markus Werning - 2016 - Synthese 193 (5):1345-1385.
SAwSu: An Integrated Model of Associative and Reinforcement Learning. Vladislav D. Veksler, Christopher W. Myers & Kevin A. Gluck - 2014 - Cognitive Science 38 (3):580-598.
Is mental time travel real time travel? Michael Barkasi & Melanie G. Rosen - 2020 - Philosophy and the Mind Sciences 1 (1):1-27.
The philosophy of memory today and tomorrow: Editors' introduction. Kourken Michaelian, Dorothea Debus & Denis Perrin - 2018 - In Kourken Michaelian, Dorothea Debus & Denis Perrin (eds.), New Directions in the Philosophy of Memory. Routledge. pp. 1-9.
The making of a memory mechanism. Carl F. Craver - 2003 - Journal of the History of Biology 36 (1):153-95.
Analytics
Added to PP: 2020-12-22
Downloads: 14 (#733,752)
6 months: 3 (#227,700)
References found in this work
Deictic codes for the embodiment of cognition. Dana H. Ballard, Mary M. Hayhoe, Polly K. Pook & Rajesh P. N. Rao - 1997 - Behavioral and Brain Sciences 20 (4):723-742.
The time course of perceptual choice: The leaky, competing accumulator model. Marius Usher & James L. McClelland - 2001 - Psychological Review 108 (3):550-592.
Episodic future thinking. Cristina M. Atance & Daniela K. O'Neill - 2001 - Trends in Cognitive Sciences 5 (12):533-539.
Norepinephrine ignites local hotspots of neuronal excitation: How arousal amplifies selectivity in perception and memory. Mara Mather, David Clewett, Michiko Sakaki & Carolyn W. Harley - 2016 - Behavioral and Brain Sciences 39:1-100.
The goal-gradient hypothesis and maze learning. C. L. Hull - 1932 - Psychological Review 39 (1):25-43.