The rat as particle filter
Cited by 18 (2 self)
The core tenet of Bayesian modeling is that subjects represent beliefs as distributions over possible hypotheses. Such models have fruitfully been applied to the study of learning in the context of animal conditioning experiments (and analogously designed human learning tasks), where they explain phenomena such as retrospective revaluation that seem to demonstrate that subjects entertain multiple hypotheses simultaneously. However, a recent quantitative analysis of individual subject records by Gallistel and colleagues cast doubt on a very broad family of conditioning models by showing that all of the key features the models capture about even simple learning curves are artifacts of averaging over subjects. Rather than smooth learning curves (which Bayesian models interpret as revealing the gradual tradeoff from prior to posterior as data accumulate), subjects acquire suddenly, and their predictions continue to fluctuate abruptly. These data demand revisiting the model of the individual versus the ensemble, and also raise the worry that more sophisticated behaviors thought to support Bayesian models might also emerge artifactually from averaging over the simpler behavior of individuals. We suggest that the suddenness of changes in subjects' beliefs (as expressed in conditioned behavior) can be modeled by assuming they are conducting inference using sequential Monte Carlo sampling with a small number of samples (one, in our simulations). Ensemble behavior resembles exact Bayesian models since, as in particle filters, it averages over many samples. Further, the model is capable of exhibiting sophisticated behaviors like retrospective revaluation at the ensemble level, even given minimally sophisticated individuals that do not track uncertainty from trial to trial. These results point to the need for more sophisticated experimental analysis to test Bayesian models, and refocus theorizing on the individual, while at the same time clarifying why the ensemble may be of interest.
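The single-sample idea in this abstract can be illustrated with a toy simulation. The sketch below is not the authors' model: the Bernoulli reward likelihood, the discrete grid of hypotheses, the `hazard` jump probability, and the function names `run_single_particle` and `simulate` are all assumptions made for illustration. Each simulated subject carries exactly one hypothesis about the reward probability and resamples it on every trial from the proposal that conditions on the new outcome, so individual traces move in abrupt steps while the across-subject average changes more gradually.

```python
import random


def run_single_particle(outcomes, grid, hazard, rng):
    """Track a reward probability with ONE particle (illustrative assumption).

    On each trial the particle is redrawn from p(w_t | w_{t-1}, y_t), which is
    proportional to p(y_t | w_t) * p(w_t | w_{t-1}) and can be computed exactly
    on a discrete grid of hypotheses.
    """
    n = len(grid)
    idx = rng.randrange(n)  # initial hypothesis drawn from a uniform prior
    trace = []
    for y in outcomes:
        weights = []
        for j, w in enumerate(grid):
            # Transition prior: keep the current hypothesis with prob. 1 - hazard,
            # otherwise jump to a uniformly chosen grid value.
            prior = hazard / n + (1.0 - hazard if j == idx else 0.0)
            # Bernoulli likelihood of the observed outcome under hypothesis w.
            lik = w if y == 1 else 1.0 - w
            weights.append(prior * lik)
        idx = rng.choices(range(n), weights=weights)[0]
        trace.append(grid[idx])
    return trace


def simulate(n_subjects=200, n_trials=60, p_reward=0.9, hazard=0.05, seed=0):
    """Run many one-particle subjects on the same outcome sequence."""
    rng = random.Random(seed)
    grid = [0.05 * k for k in range(1, 20)]  # hypotheses 0.05, 0.10, ..., 0.95
    outcomes = [1 if rng.random() < p_reward else 0 for _ in range(n_trials)]
    traces = [run_single_particle(outcomes, grid, hazard, rng)
              for _ in range(n_subjects)]
    # Ensemble curve: per-trial mean of the subjects' single samples.
    ensemble = [sum(col) / n_subjects for col in zip(*traces)]
    return traces, ensemble


if __name__ == "__main__":
    traces, ensemble = simulate()
    print("one subject:", [round(v, 2) for v in traces[0][:15]])
    print("ensemble:   ", [round(v, 2) for v in ensemble[:15]])
```

Because a lone particle carries no importance weight, the ensemble mean here only approximates the exact posterior mean; it is meant to show the qualitative contrast between stepwise individuals and a smoother average, not to reproduce any reported result.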
Trial-by-trial data analysis using computational models
, 2009
Cited by 1 (0 self)
In numerous and high-profile studies, researchers have recently begun to integrate computational models
Connections Between Computational and Neurobiological Perspectives on Decision Making -- decision theory, . . .
, 2008
Embodied Inference: or “I think therefore I am, if I am what I think”
Cited by 1 (0 self)
This chapter considers situated and embodied cognition in terms of the free-energy principle. The free-energy formulation starts with the premise that biological agents must actively resist a natural tendency to disorder. It appeals to the idea that agents are essentially inference machines that
Free Energy, Value, and Attractors
, 2012
Cited by 1 (0 self)
It has been suggested recently that action and perception can be understood as minimising the free energy of sensory samples. This ensures that agents sample the environment to maximise the evidence for their model of the world, such that exchanges with the environment are predictable and adaptive. However, the free energy account does not invoke reward or cost-functions from reinforcement-learning and optimal control theory. We therefore ask whether reward is necessary to explain adaptive behaviour. The free energy formulation uses ideas from statistical physics to explain action in terms of minimising sensory surprise. Conversely, reinforcement-learning has its roots in behaviourism and engineering and assumes that agents optimise a policy to maximise future reward. This paper tries to connect the two formulations and concludes that optimal policies correspond to empirical priors on the trajectories of hidden environmental states, which compel agents to seek out the (valuable) states they expect to encounter.
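The link this abstract draws between minimising free energy and maximising model evidence follows from the standard variational bound. The notation below is assumed for illustration and does not come from the abstract: $s$ denotes sensory samples, $\psi$ hidden environmental states, $p$ the agent's generative model, and $q$ an approximate posterior over $\psi$.

```latex
\begin{aligned}
F(s, q) &= \mathbb{E}_{q(\psi)}\!\left[\ln q(\psi) - \ln p(s, \psi)\right] \\
        &= -\ln p(s) + D_{\mathrm{KL}}\!\left[\, q(\psi) \,\|\, p(\psi \mid s) \,\right] \\
        &\geq -\ln p(s)
\end{aligned}
```

Since the KL divergence is non-negative, $F$ is an upper bound on surprise $-\ln p(s)$, so lowering $F$ raises a lower bound on the log evidence $\ln p(s)$.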
Memory & Cognition
doi:10.3758/MC.36.8.1460 Ratio and difference comparisons of expected reward in decision-making tasks
Neurocomputational mechanisms of reinforcement-guided learning in humans: A review
Adapting decision-making according to dynamic and probabilistic changes in action-reward contingencies is critical for survival in a competitive and resource-limited world. Much research has focused on elucidating the neural systems and computations that underlie how the brain identifies whether the consequences of actions are relatively good or bad. In contrast, less empirical research has focused on the mechanisms by which reinforcements might be used to guide decision-making. Here I review recent studies that have attempted to bridge this gap by characterizing how humans use reward information to guide and optimize decision-making. Regions that have been implicated in reinforcement processing, including the striatum, orbitofrontal cortex, and anterior cingulate, also seem to mediate how reinforcements are used to adjust subsequent decision-making. This research provides insights into why the brain devotes resources to evaluating reinforcements, and suggests a direction for future research, from studying the mechanisms of reinforcement processing to the mechanisms of reinforcement learning.
Policy Gradient Coagent Networks
We present a novel class of actor-critic algorithms for actors consisting of sets of interacting modules. We present, analyze theoretically, and empirically evaluate an update rule for each module, which requires only local information: the module’s input, output, and the TD error broadcast by a critic. Such updates are necessary when computation of compatible features becomes prohibitively difficult and are also desirable to increase the biological plausibility of reinforcement learning methods.
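A rough sketch of a local update of this kind, under assumptions that are not taken from the paper: each module is a Bernoulli-logistic unit, the critic is a two-entry value table, and the task is a toy one-step problem in which the action must match a binary state. The names `Coagent` and `train` are hypothetical. Each module adjusts its weights using only its own input, its own output, and the scalar TD error shared by the critic.

```python
import math
import random


class Coagent:
    """A stochastic Bernoulli-logistic module (an illustrative assumption)."""

    def __init__(self, n_inputs, rng):
        self.w = [0.0] * (n_inputs + 1)  # last weight acts as a bias
        self.rng = rng

    def act(self, x):
        self.x = list(x) + [1.0]  # remember the local input (plus bias term)
        z = sum(wi * xi for wi, xi in zip(self.w, self.x))
        self.p = 1.0 / (1.0 + math.exp(-z))
        self.out = 1 if self.rng.random() < self.p else 0
        return self.out

    def update(self, td_error, lr):
        # Local rule: the gradient of log P(out | x) w.r.t. w is (out - p) * x,
        # scaled by the TD error broadcast from the critic.
        for i, xi in enumerate(self.x):
            self.w[i] += lr * td_error * (self.out - self.p) * xi


def train(n_episodes=5000, lr=0.5, critic_lr=0.1, seed=0):
    rng = random.Random(seed)
    hidden = Coagent(1, rng)   # sees only the state
    output = Coagent(2, rng)   # sees the state and the hidden module's output
    value = [0.0, 0.0]         # tabular critic, one entry per state
    rewards = []
    for _ in range(n_episodes):
        s = rng.randrange(2)
        h = hidden.act([s])
        a = output.act([s, h])
        r = 1.0 if a == s else 0.0
        # One-step episode: the next state is terminal, so the TD error is r - V(s).
        delta = r - value[s]
        value[s] += critic_lr * delta
        # Every module receives the same scalar TD error.
        hidden.update(delta, lr)
        output.update(delta, lr)
        rewards.append(r)
    return rewards


if __name__ == "__main__":
    rewards = train()
    print("mean reward, last 1000 episodes:", sum(rewards[-1000:]) / 1000)
```

No module needs gradients from any other module, which is the locality property the abstract emphasizes; the particular unit type, learning rates, and task are placeholders.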

