Integrating memories to guide decisions (2015)
BibTeX
@article{Shohamy15integratingmemories,
author = {Daphna Shohamy and Nathaniel D. Daw},
title = {Integrating memories to guide decisions},
journal = {Current Opinion in Behavioral Sciences},
volume = {5},
pages = {85--90},
year = {2015}
}
Abstract
Adaptive decisions are guided by past experience. Yet, decisions are often made between alternatives that have not been directly experienced before, requiring the integration of memories across multiple past events. We review emerging findings supporting at least two seemingly distinct mechanisms for how the brain draws on memories in the service of choice. Prospective integration is triggered when a new decision is faced, allowing inferences to be drawn on the spot. A complementary retrospective mechanism integrates existing memories into a network of related experiences before a decision is actually faced. We discuss evidence supporting each of these mechanisms and the implications for understanding the role of memory in decision-making.

Introduction

Memory is central to adaptive behavior. To make flexible decisions, organisms must draw on past experiences to anticipate and evaluate the outcomes of different candidate courses of action. In short, choice depends on memory. Here we review a range of research in humans and animals concerning how memories are retrieved and used to guide value-based decisions. We focus particularly on questions about when and in what order representations of previous events are built and accessed, and how this subserves the computation of decision variables to guide flexible choice. A key issue in decision making has been distinguishing different systems for evaluating options. It is now widely appreciated that seemingly the same behavior (a lever press or a turn in a maze) may in different circumstances arise from a number of different systems that are psychologically, neurally, and computationally distinct.
Early research focused on simple ''model-free'' incremental learning, associated with the midbrain dopamine system. This review concerns a different (though itself not necessarily unitary) class of decisions, which has been defined, operationally, by challenging organisms with choices that cannot be solved by simple model-free learning. There are many examples of humans and animals demonstrating such integrative reasoning in the laboratory; this is the defining feature of a category of behavior known as goal-directed action. The essential feature of such tasks is that subjects integrate information about two stimuli, A and B, whose associations were acquired in separate episodes. Participants' behavior on these sorts of decisions clearly reflects processes that are beyond the simple stamping-in of any previous choice. Instead, it reflects what appear to be inferences based upon integration of the different elements. A similar structure and logic are characteristic of a variety of other experiments, from spatial navigation (latent learning, shortcuts) [4] to instrumental conditioning (reward devaluation). But how, exactly, are such inferences produced? It is often assumed that this sort of behavior demonstrates something like inferential reasoning, conducted at the time of the decision. Yet this need not be the case. Instead, it is possible that such behavior reflects integration processes that happened earlier, before a decision was faced. Choice behavior itself does not generally reveal when the computations that produced it took place. Similarly, neuroscientific data, including pretraining lesions, do not by themselves pin down when these computations occurred.

Prospective integration

The hypothesis, implicit or explicit, behind much work on flexible decision making is that the decision itself triggers the computation to evaluate the options, in which subjects combine the relevant associations ''just in time''.
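To make the ''just in time'' idea concrete, here is a minimal sketch (our illustration, not from the paper; all names and the discount parameter are assumptions). A decision-maker stores only separately acquired one-step associations and, at choice time, constructs a value for an option by chaining those associations forward:

```python
# Minimal sketch of prospective, decision-time evaluation: no value is
# cached anywhere; it is computed on the spot by chaining stored one-step
# associations forward (a tiny model-based lookahead).
def prospective_value(option, transitions, rewards, gamma=0.9, depth=5):
    """Estimate an option's value at choice time by mental simulation."""
    value = rewards.get(option, 0.0)
    if depth > 0 and option in transitions:
        value += gamma * prospective_value(
            transitions[option], transitions, rewards, gamma, depth - 1)
    return value

# Two separately acquired episodes: A was followed by B (no reward),
# and B was later paired with reward. A itself was never rewarded.
transitions = {"A": "B"}
rewards = {"B": 1.0}

# No value for A was precomputed; it is assembled only when A is evaluated.
print(prospective_value("A", transitions, rewards))  # 0.9
```

This mirrors the integration problem above: A's value can only be inferred by combining the A-to-B and B-to-reward episodes, and here that combination happens at the moment of choice.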
This is a strategy we broadly refer to as prospection. In some cases, it does seem almost a foregone conclusion that subjects must evaluate an option at decision time. For instance, when a truly novel option is introduced (the very first time you face an exotic dish like ''tea jelly'' or ''pea mousse'' [12], whose value might be imagined only from the properties of its components), it seems unlikely that the brain could possibly have precomputed its value. Indeed, upon evaluating tea jelly for the first time, fMRI adaptation effects demonstrate that people access the separate component elements, tea and jelly. Other examples from both human neuroimaging and rodent neural inactivation show that integrative, inferential reasoning in similar tasks likewise unfolds at decision time.

Retrospective integration

Despite much evidence for prospective integration mechanisms at decision time, there are also many intriguing hints that seemingly similar behaviors can be produced in a different way. Interestingly, returning to the case of position representations in the rodent hippocampus, representations of positions other than the current one do not always run ahead of the animal. Indeed, similar retrospective ''replay'' phenomena have also been reported, in which place cells represent positions where the animal has been in the past, including reverse replay behind the animal. In a spatial task, the analogous integration problem involves linking reward received in one maze arm with experience acquired elsewhere in the maze. All of these dynamics strongly suggest a mechanism for the integrative propagation of reward information opportunistically and ahead of time, rather than on-demand when facing a decision. However, such retrospective activity in rodents has not directly been linked to the solution of behavioral tasks that demonstrably require integration. Addressing this gap, retrospective integration has also been examined in humans in the context of the sensory preconditioning task. Similar integration might also be supported by re-activating raw experience from Phases I and II during rest periods or sleep following Phase II, driving new learning about A via offline integration. In machine learning, such replay-driven learning has been proposed in an architecture called DYNA.

Figure: Schematic of possible mechanisms underlying integration of memories to guide decisions. When confronted with a new decision that cannot be wholly based on past rewards, such as predicting whether the blue square will lead to reward or not, participants' behavior tends to reflect the integration of memory for past relevant events. This integration can happen via two distinct mechanisms. (a) One possibility is that at the time of making the decision, participants retrieve relevant memories and use them to engage in prospective reasoning about the likely outcomes of their decisions. (b) Another possibility is that the overlap in the memories themselves triggers integration of distinct episodes during learning/encoding, before a decision is ever confronted. In this sort of retrospective mechanism, the attribution of reward value to the blue square would have already been in place before a decision was ever required.

Current Opinion in Behavioral Sciences 2015, 5:85-90

Conclusions and future questions

Altogether, data support the idea that memories are retrieved and integrated to construct decision variables at a variety of times, ranging from the time of encoding to the time of decision. These mechanisms are clearly not mutually exclusive, and indeed there is good evidence supporting each of them. This raises a larger question, though: how does the brain decide which sorts of strategies to evoke under which circumstances?
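The retrospective strategy can be sketched in the same toy setting, loosely in the spirit of the DYNA architecture discussed above (our illustration, with assumed names and parameters): stored episodes are replayed offline, each replay drives an ordinary incremental value update, and integrated values are therefore already in place before any choice arises.

```python
import random

def dyna_replay(Q, memory, n_replays=200, alpha=0.5, gamma=0.9, seed=0):
    """Offline replay in the spirit of DYNA: sample stored
    (state, next_state, reward) episodes and apply an ordinary one-step
    value update to each replayed episode."""
    rng = random.Random(seed)
    for _ in range(n_replays):
        s, s_next, r = rng.choice(memory)        # re-activate a stored episode
        target = r + gamma * Q.get(s_next, 0.0)  # bootstrapped update target
        Q[s] = Q.get(s, 0.0) + alpha * (target - Q.get(s, 0.0))
    return Q

# Phase I: A led to B with no reward; Phase II: B led to reward.
memory = [("A", "B", 0.0), ("B", "end", 1.0)]
Q = dyna_replay({}, memory)

# After offline replay, A carries value it never earned directly, so no
# decision-time inference is needed when A is later encountered.
print(round(Q["A"], 2), round(Q["B"], 2))
```

The contrast with decision-time prospection is in where the work happens: here the integration across the two phases is finished during replay, before the choice is ever posed.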
Earlier work has considered rational accounts of the tradeoff between model-based and model-free decision making in terms of the relative costs (e.g., delay) and benefits (e.g., better chance of gaining rewards) of prospection at decision time, a tradeoff which will vary depending on issues like the amount of time pressure or training. If replay may also happen between experience and the decisions it supports, then analogous questions (when to replay? which events to replay?) might similarly be understood in terms of rational analysis of costs and benefits. Such prioritization has been the subject of some work in computer science. A second question, which we largely skirted, is what sorts of memory representations are operated on by the replay and preplay operations we have considered. Classic work on model-based decision making envisions that it operates over semantic representations (like maps), which may themselves arise from the integration or averaging over many distinct experiences.
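The ''which events to replay?'' question has a standard computational treatment. The following hedged sketch, in the spirit of prioritized sweeping from the computer-science work mentioned above (names, parameters, and data structures are our assumptions), replays stored transitions in order of their current prediction error, so reward information sweeps backward from the goal:

```python
import heapq

def prioritized_replay(Q, memory, alpha=1.0, gamma=0.9, threshold=1e-3):
    """Replay stored (state, next_state, reward) transitions in order of
    absolute prediction error, re-queueing predecessors whose errors grow."""
    heap = []
    for i, (s, s_next, r) in enumerate(memory):
        err = abs(r + gamma * Q.get(s_next, 0.0) - Q.get(s, 0.0))
        heapq.heappush(heap, (-err, i))          # max-priority via negation
    order = []
    while heap:
        neg_err, i = heapq.heappop(heap)
        if -neg_err < threshold:                 # nothing useful left to replay
            continue
        s, s_next, r = memory[i]
        Q[s] = Q.get(s, 0.0) + alpha * (
            r + gamma * Q.get(s_next, 0.0) - Q.get(s, 0.0))
        order.append(s)
        for j, (p, p_next, pr) in enumerate(memory):
            if p_next == s:                      # predecessor's error may have grown
                err = abs(pr + gamma * Q[s] - Q.get(p, 0.0))
                if err >= threshold:
                    heapq.heappush(heap, (-err, j))
    return Q, order

# A chain of experiences: only the final step was ever rewarded.
memory = [("A", "B", 0.0), ("B", "C", 0.0), ("C", "goal", 1.0)]
Q, order = prioritized_replay({}, memory)
print(order)  # replay begins at the rewarded step and sweeps backward
```

This is one concrete cost-benefit answer to the prioritization question: replay effort is spent only where it is expected to change values the most.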