## 13 Perception, Action, and Utility: The Tangled Skein

### BibTeX

@MISC{Gershman_13perception,

author = {Samuel J. Gershman and Nathaniel D. Daw},

title = {13 Perception, Action, and Utility: The Tangled Skein},

year = {}

}

### Abstract

Statistical decision theory seems to offer a clear framework for the integration of perception and action. In particular, it defines the problem of maximizing the utility of one’s decisions in terms of two subtasks: inferring the likely state of the world, and tracking the utility that would result from different candidate actions in different states. This computational-level description underpins more process-level research in neuroscience about the brain’s dynamic mechanisms for, on the one hand, inferring states and, on the other hand, learning action values. However, a number of different strands of recent work on this more algorithmic level have cast doubt on the basic shape of the decision-theoretic formulation, specifically the clean separation between states’ probabilities and utilities. We consider the complex interrelationship between perception, action, and utility implied by these accounts. Normative theories of learning and decision making are motivated by a computational-level analysis of the task facing an organism: What should

### Citations

4117 | Reinforcement learning: An introduction
- Sutton, Barto
- 1998
Citation Context ... Moehlis, Holmes, & Cohen, 2006); of optimal control theory in sensorimotor control (Kording & Wolpert, 2006; Trommershäuser, Maloney, & Landy, 2008); of Bellman’s equation in reinforcement learning (Sutton & Barto, 1998; Dayan & Daw, 2008); of subjective expected utility theory in economics (Von Neumann & Morgenstern, 1947; Savage, 1954); and of foraging theory in behavioral ecology (McNamara & Houston, 1980; Stephe... |

3728 | Prospect Theory: An Analysis of Decision under Risk. Econometrica
- Kahneman, Tversky
- 1979
Citation Context ...is means that if the brain is to perform the necessary calculations, it must use some form of approximation. Although statistical decision theory has been criticized on many other grounds (see, e.g., Kahneman & Tversky, 1979; Camerer, 1998), we focus on these aspects because they highlight the algorithmic and implementational commitments of the theory. Statistical decision theory, to be directly implemented in the brain,... |

1850 | The Foundations of Statistics
- Savage
- 1954
Citation Context ...aloney, & Landy, 2008); of Bellman’s equation in reinforcement learning (Sutton & Barto, 1998; Dayan & Daw, 2008); of subjective expected utility theory in economics (Von Neumann & Morgenstern, 1947; Savage, 1954); and of foraging theory in behavioral ecology (McNamara & Houston, 1980; Stephens & Krebs, 1986). More recently, neuroscientists have begun to probe the brain for signatures of these assumptions, in... |

1374 | Statistical Decision Theory and Bayesian Analysis
- Berger
- 1985
Citation Context ...05). The assumptions of statistical decision theory are, in various forms, pervasive throughout psychology, neuroscience, economics, and ecology (not to mention statistics and engineering; see, e.g., Berger, 1985). They are the basis of signal detection theory and drift-diffusion models in perceptual psychology (Green & Swets, 1966; Bogacz, Brown, Moehlis, Holmes, & Cohen, 2006); of optimal control theory in ... |

1064 | Signal detection theory and psychophysics
- Green, Swets
- 1974
Citation Context ...cience, economics, and ecology (not to mention statistics and engineering; see, e.g., Berger, 1985). They are the basis of signal detection theory and drift-diffusion models in perceptual psychology (Green & Swets, 1966; Bogacz, Brown, Moehlis, Holmes, & Cohen, 2006); of optimal control theory in sensorimotor control (Kording & Wolpert, 2006; Trommershäuser, Maloney, & Landy, 2008); of Bellman’s equation in reinforc... |

704 | A neural substrate of prediction and reward
- Schultz, Dayan, et al.
- 1997
Citation Context ...a recent review, see Niv, 2009). In particular, prominent accounts of the responses of dopamine neurons suggest that they carry a “reward prediction error” signal for updating such a running average (Schultz, Dayan, & Montague, 1997). The targets of these neurons, notably in the striatum and prefrontal cortex, are believed to be involved in valuation and action selection (Montague, King-Casas, & Cohen, 2006). Insofar as this str... |

539 | Hierarchical models of object recognition in cortex. Nature Neuroscience 2(11): 1019–1025 - Riesenhuber, Poggio - 1999 |

433 | Foraging theory
- Stephens, Krebs
- 1986
Citation Context ..., 1998; Dayan & Daw, 2008); of subjective expected utility theory in economics (Von Neumann & Morgenstern, 1947; Savage, 1954); and of foraging theory in behavioral ecology (McNamara & Houston, 1980; Stephens & Krebs, 1986). More recently, neuroscientists have begun to probe the brain for signatures of these assumptions, in particular the neural computations of utilities and posterior probabilities (Glimcher, 2003). We... |

268 | Risk, ambiguity, and the Savage axioms
- Ellsberg
- 1961
Citation Context ...ems to be no candidate for an intermediate stage of pure probability representation over states. A different source of contrary evidence comes from behavioral economics. The classic Ellsberg paradox (Ellsberg, 1961) revealed preferences in human choice behavior that are not probabilistically sophisticated. The example given by Ellsberg involves drawing a ball from an urn containing 30 red balls and 60 black or ... |

193 | Hierarchical Bayesian inference in the visual cortex
- Lee, Mumford
- 2003
Citation Context ...f utilities for taking the action in each state, weighted by the state’s posterior probability. The generic role commonly imputed to the perceptual system is the computation of this posterior belief (Lee & Mumford, 2003; Knill & Pouget, 2004; Friston, 2005). The assumptions of statistical decision theory are, in various forms, pervasive throughout psychology, neuroscience, economics, and ecology (not to mention stat... |

191 | Reasoning about beliefs and actions under computational resource constraints
- Horvitz
- 1987
Citation Context ... consciousness of neuroscientists. For example, an extremely rich research tradition in artificial intelligence has examined how to incorporate computational costs into decision-making systems (e.g., Horvitz, 1988; Russell & Wefald, 1991; Zilberstein, 1995). We hope that contact with these ideas will reinvigorate thinking about the organizational principles of the brain. Acknowledgments S.J.G. was supported by... |

172 | Principles of Metareasoning
- Russell, Wefald
- 1989
Citation Context ...of neuroscientists. For example, an extremely rich research tradition in artificial intelligence has examined how to incorporate computational costs into decision-making systems (e.g., Horvitz, 1988; Russell & Wefald, 1991; Zilberstein, 1995). We hope that contact with these ideas will reinvigorate thinking about the organizational principles of the brain. Acknowledgments S.J.G. was supported by a graduate research fel... |

162 | Neural correlates of decision variables in parietal cortex. Nature 400
- Platt, Glimcher
- 1999
Citation Context ...ate nucleus are already reward-modulated, the idea of far-downstream LIP as a pure representation of posterior state probability is dubious. Indeed, other work varying rewarding outcomes for actions (Platt & Glimcher, 1999; Sugrue, Corrado, & Newsome, 2004) shows that neurons in LIP are indeed modulated by the probability and amount of reward expected for an action—probably better thought of as related to expected util... |

149 | The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks
- Bogacz, Brown, et al.
- 2006
Citation Context ...d ecology (not to mention statistics and engineering; see, e.g., Berger, 1985). They are the basis of signal detection theory and drift-diffusion models in perceptual psychology (Green & Swets, 1966; Bogacz, Brown, Moehlis, Holmes, & Cohen, 2006); of optimal control theory in sensorimotor control (Kording & Wolpert, 2006; Trommershäuser, Maloney, & Landy, 2008); of Bellman’s equation in reinforcement learning (Sutton & Barto, 1998; Dayan & D... |

143 | Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
- Daw, Niv, et al.
- 2005
Citation Context ...at this algorithm converges to the optimal policy and then went on to show that it could reproduce several behavioral signatures of goal-directed behavior commonly associated with the DLPFC (see also Daw, Niv, & Dayan, 2005). The most detailed articulation of common mechanisms for inference and decision, however, has come from the work of Karl Friston and his colleagues (for a recent review, see Friston, 2010). Friston ... |

129 | Neural basis of a perceptual decision in the parietal cortex (area LIP) of the rhesus monkey - Shadlen, Newsome - 2001 |

121 | A More Robust Definition of Subjective Probability
- Machina, Schmeidler
- 1992
Citation Context ...eparation between probabilities and utilities. In particular, the posterior must be computed independently of the expected utility. This assumption is sometimes known as probabilistic sophistication (Machina & Schmeidler, 1992; Bossaerts, Preuschoff, & Hsu, 2008). It means that I can state how much N Rabinovich—Principles of Brain Dynamics Rabinovich_9108_013_main.indd 295 1/31/2012 12:07:24 PM296 Samuel J. Gershman and N... |

114 | The Bayesian brain: the role of uncertainty in neural coding and computation. Trends in Neurosciences
- Knill, Pouget
- 2004
Citation Context ...g the action in each state, weighted by the state’s posterior probability. The generic role commonly imputed to the perceptual system is the computation of this posterior belief (Lee & Mumford, 2003; Knill & Pouget, 2004; Friston, 2005). The assumptions of statistical decision theory are, in various forms, pervasive throughout psychology, neuroscience, economics, and ecology (not to mention statistics and engineering... |

112 | A theory of cortical responses
- Friston
- 2005
Citation Context ...tate, weighted by the state’s posterior probability. The generic role commonly imputed to the perceptual system is the computation of this posterior belief (Lee & Mumford, 2003; Knill & Pouget, 2004; Friston, 2005). The assumptions of statistical decision theory are, in various forms, pervasive throughout psychology, neuroscience, economics, and ecology (not to mention statistics and engineering; see, e.g., Be... |

112 | Dynamic programming and influence diagrams
- Tatman, Shachter
- 1990
Citation Context ...tational mechanisms over effectively different likelihood functions. One version of this idea has been explored by Botvinick and An (2009). Building on earlier work in computer science (Cooper, 1988; Tatman & Shachter, 1990), Botvinick and An argued that the dorsolateral prefrontal cortex (DLPFC) could be thought of as computing action ... |

95 | Operational Rationality through Compilation of Anytime Algorithms
- Zilberstein
- 1993
Citation Context ...example, an extremely rich research tradition in artificial intelligence has examined how to incorporate computational costs into decision-making systems (e.g., Horvitz, 1988; Russell & Wefald, 1991; Zilberstein, 1995). We hope that contact with these ideas will reinvigorate thinking about the organizational principles of the brain. Acknowledgments S.J.G. was supported by a graduate research fellowship from the Na... |

94 | Neuronal correlates of a perceptual decision - Newsome, Britten, et al. - 1989 |

88 | Matching behavior and the representation of value in the parietal cortex
- Sugrue, Corrado, et al.
- 2004
Citation Context ...reward-modulated, the idea of far-downstream LIP as a pure representation of posterior state probability is dubious. Indeed, other work varying rewarding outcomes for actions (Platt & Glimcher, 1999; Sugrue, Corrado, & Newsome, 2004) shows that neurons in LIP are indeed modulated by the probability and amount of reward expected for an action—probably better thought of as related to expected utility rather than state probability ... |

86 | Banburismus and the brain: decoding the relationship between sensory stimuli, decisions, and reward
- Gold, Shadlen
- 2002
Citation Context ...t appeared to integrate motion evidence over time in a manner that predicted the timing of behavioral responses. These and other data were interpreted in terms of classical signal detection concepts (Gold & Shadlen, 2002), with LIP neurons reporting a log-likelihood ratio for motion direction (the “weight of evidence”) based on sensory evidence provided by inputs ... |

76 | A method for using belief networks as influence diagrams
- Cooper
- 1988
Citation Context ...derlying computational mechanisms over effectively different likelihood functions. One version of this idea has been explored by Botvinick and An (2009). Building on earlier work in computer science (Cooper, 1988; Tatman & Shachter, 1990), Botvinick and An argued that the dorsolateral prefrontal cortex (DLPFC) could be thought of as computing action ... |

74 | Cortical microstimulation influences perceptual judgements of motion direction - Salzman, Britten, et al. - 1990 |

61 | Decisions, Uncertainty, and the Brain: The Science of Neuroeconomics
- Glimcher
- 2003
Citation Context ...phens & Krebs, 1986). More recently, neuroscientists have begun to probe the brain for signatures of these assumptions, in particular the neural computations of utilities and posterior probabilities (Glimcher, 2003). We focus on two aspects of decision theory that have important implications for its implementation in the brain: 1. Decision theory implies a strong distinction or separation between probabilities ... |

60 | The free-energy principle: A unified brain theory
- Friston
- 2010
Citation Context ... Daw, Niv, & Dayan, 2005). The most detailed articulation of common mechanisms for inference and decision, however, has come from the work of Karl Friston and his colleagues (for a recent review, see Friston, 2010). Friston has argued that many aspects of neural computation can be subsumed under a single “free-energy principle.” To understand this principle, let us return briefly to the variational approximati... |

57 | Base rates in category learning
- Kruschke
- 1996
Citation Context ...ise from the stochastic nature of the sampling process when only a small number of samples are used. Another suggestive experimental finding is an effect known as highlighting (Medin & Bettger, 1991; Kruschke, 1996), which concerns the trial order–dependent dynamics of the learning of predictions about cues. In the balanced version of the experimental design, subjects are presented with three cues (A, B, and C)... |

54 | Bayesian decision theory in sensorimotor control
- Kording, Wolpert
- 2006
Citation Context ...of signal detection theory and drift-diffusion models in perceptual psychology (Green & Swets, 1966; Bogacz, Brown, Moehlis, Holmes, & Cohen, 2006); of optimal control theory in sensorimotor control (Kording & Wolpert, 2006; Trommershäuser, Maloney, & Landy, 2008); of Bellman’s equation in reinforcement learning (Sutton & Barto, 1998; Dayan & Daw, 2008); of subjective expected utility theory in economics (Von Neumann & ... |

46 | The learning curve: implications of a quantitative analysis
- Gallistel, Fairhurst, et al.
- 2004
Citation Context ...eveals that, at the individual subject level, the dynamics of choice behavior are quite different: Responses appear to change abruptly over the course of learning and never to reach stable asymptote (Gallistel, Fairhurst, & Balsam, 2004). One explanation, proposed by Daw and Courville (2008), is that subjects use a Monte Carlo approximation like particle filtering to approximate the posterior, and abrupt changes arise from the stoch... |

45 | Probabilistic inference for solving discrete and continuous state Markov decision processes
- Toussaint, Storkey
- 2006
Citation Context ...1 Decision Making as Probabilistic Inference A rich vein of recent work in machine learning has explored the idea that decision problems can be reframed as inference problems (Dayan & Hinton, 1997; Toussaint & Storkey, 2006; Hoffman, de Freitas, Doucet, & Peters, 2009; Vlassis & Toussaint, 2009; Theodorou, Buchli, & Schaal, 2010). Although these approaches differ in their precise mathematical formulation, the common ide... |

44 | Using expectation-maximization for reinforcement learning
- Dayan, Hinton
- 1997
Citation Context ... approximations. 13.5.1 Decision Making as Probabilistic Inference A rich vein of recent work in machine learning has explored the idea that decision problems can be reframed as inference problems (Dayan & Hinton, 1997; Toussaint & Storkey, 2006; Hoffman, de Freitas, Doucet, & Peters, 2009; Vlassis & Toussaint, 2009; Theodorou, Buchli, & Schaal, 2010). Although these approaches differ in their precise mathematical ... |

40 | Attention modulates responses in the human lateral geniculate nucleus - O'Connor, Fukui, et al. - 2002 |

38 | Reward timing in the primary visual cortex
- Shuler, Bear
- 2006
Citation Context ...cal decision theory. 13.3.3 Challenges The full story, however, is not so simple. First, abundant evidence indicates that reward modulation occurs at all levels of the visual hierarchy, including V1 (Shuler & Bear, 2006; Serences, 2008) and even before that in the lateral geniculate nucleus (Komura et al., 2001; O’Connor, Fukui, Pinsk, & Kastner, 2002). For example, Shuler and Bear (2006) trained rats to associate m... |

28 | Local Bayesian learning with applications to retrospective revaluation and highlighting - Kruschke - 2006 |

24 | Bounded rationality in individual decision making
- Camerer
- 1998
Citation Context ... is to perform the necessary calculations, it must use some form of approximation. Although statistical decision theory has been criticized on many other grounds (see, e.g., Kahneman & Tversky, 1979; Camerer, 1998), we focus on these aspects because they highlight the algorithmic and implementational commitments of the theory. Statistical decision theory, to be directly implemented in the brain, requires segre... |

23 | Imaging valuation models in human choice
- Montague, King-Casas, et al.
- 2006
Citation Context ... such a running average (Schultz, Dayan, & Montague, 1997). The targets of these neurons, notably in the striatum and prefrontal cortex, are believed to be involved in valuation and action selection (Montague, King-Casas, & Cohen, 2006). Insofar as this strategy already suggests an alternative to the staged representation of probabilities and utilities—because it directly learns utilities in expectation over outcome stochasticity a... |

20 | Bayesian approaches to associative learning: From passive to active learning
- Kruschke
- 2008
Citation Context ...imal is tasked with estimating the expected utility associated with each choice, where the hidden state represents a scalar association parameter governing the relationship between choice and reward (Kruschke, 2008). Assuming humans and animals are “soft” maximizers, gradual changes in expected utility imply gradual changes in choice behavior. However, careful analysis reveals that, at the individual subject le... |

20 | Decision making, movement planning and statistical decision theory. Trends Cogn Sci 12
- Trommershäuser, Maloney, et al.
- 2008
Citation Context ...y and drift-diffusion models in perceptual psychology (Green & Swets, 1966; Bogacz, Brown, Moehlis, Holmes, & Cohen, 2006); of optimal control theory in sensorimotor control (Kording & Wolpert, 2006; Trommershäuser, Maloney, & Landy, 2008); of Bellman’s equation in reinforcement learning (Sutton & Barto, 1998; Dayan & Daw, 2008); of subjective expected utility theory in economics (Von Neumann & Morgenstern, 1947; Savage, 1954); and of... |

19 | The pigeon as particle filter - Daw, Courville - 2008 |

19 | Representation and timing in theories of the dopamine system
- Daw, Courville, et al.
- 2006
Citation Context ...tate inference systems discussed earlier: They map a given (assumed known) perceptual state and action to its utility (the latter in expectation over outcome stochasticity). Indeed, numerous authors (Daw, Courville, & Touretzky, 2006; Dayan & Daw, 2008; Braun, Mehring, & Wolpert, 2010; Gershman & Niv, 2010; Rao, 2010) have argued that the full problem of decision making under perceptual uncertainty—when both the perceptual state ... |

18 | Reinforcement learning in the brain
- Niv
- 2009
Citation Context ...rning rules for taking a running average over experienced outcomes’ utilities, so as to learn an estimate of their expectation; this is the purview of reinforcement learning (for a recent review, see Niv, 2009). In particular, prominent accounts of the responses of dopamine neurons suggest that they carry a “reward prediction error” signal for updating such a running average (Schultz, Dayan, & Montague, 19... |

18 | One and done? Optimal decisions from very few samples - Vul, Goodman, et al. - 2014 |

16 | Model-free Reinforcement Learning as Mixture Learning
- Vlassis, Toussaint
- 2009
Citation Context ...k in machine learning has explored the idea that decision problems can be reframed as inference problems (Dayan & Hinton, 1997; Toussaint & Storkey, 2006; Hoffman, de Freitas, Doucet, & Peters, 2009; Vlassis & Toussaint, 2009; Theodorou, Buchli, & Schaal, 2010). Although these approaches differ in their precise mathematical formulation, the common idea is that by transforming the utility function appropriately, one can tr... |

15 | Retrospective and prospective coding for predicted reward in the sensory thalamus. Nature 412:546–549
- Komura, Tamura, et al.
- 2001
Citation Context ...dant evidence indicates that reward modulation occurs at all levels of the visual hierarchy, including V1 (Shuler & Bear, 2006; Serences, 2008) and even before that in the lateral geniculate nucleus (Komura et al., 2001; O’Connor, Fukui, Pinsk, & Kastner, 2002). For example, Shuler and Bear (2006) trained rats to associate monocular stimulation with liquid reward and found that V1 neurons altered their firing patter... |

14 | Action and behavior: A free-energy formulation
- Friston, Daunizeau, et al.
- 2010
Citation Context ... by minimizing free energy (with respect to the model parameters) is equivalent to maximizing a lower bound on the log marginal likelihood. Friston and colleagues (Friston, Daunizeau, & Kiebel, 2009; Friston, Daunizeau, Kilner, & Kiebel, 2010) formulate the decision optimization problem in these terms. There are at least two separable claims here. The technical thrust of the work is similar to the ideas discussed earlier: If one specifies... |

14 | Testing the efficiency of sensory coding with optimal stimulus ensembles. Neuron - Machens, Gollisch, et al. - 2005 |

14 | The application of statistical decision theory to animal behaviour
- McNamara, Houston
- 1980
Citation Context ...t learning (Sutton & Barto, 1998; Dayan & Daw, 2008); of subjective expected utility theory in economics (Von Neumann & Morgenstern, 1947; Savage, 1954); and of foraging theory in behavioral ecology (McNamara & Houston, 1980; Stephens & Krebs, 1986). More recently, neuroscientists have begun to probe the brain for signatures of these assumptions, in particular the neural computations of utilities and posterior probabilit... |

13 | Goal-directed decision making in prefrontal cortex: a computational framework - Botvinick, An - 2009 |