## Semi-rational Models of Conditioning: The Case of Trial Order (2007)

Citations: 4 (2 self)

### BibTeX

@MISC{Daw07semi-rationalmodels,

author = {Nathaniel D. Daw and Aaron C. Courville and Peter Dayan},

title = {Semi-rational Models of Conditioning: The Case of Trial Order},

year = {2007}

}

### Abstract

Bayesian treatments of animal conditioning start from a generative model that precisely specifies a set of assumptions about the structure of the learning task. Optimal rules for learning are direct mathematical consequences of these assumptions. In terms of Marr’s (1982) levels of analysis, the main task at the computational level

### Citations

9054 | Maximum likelihood from incomplete data via the EM algorithm
- Dempster, Laird, et al.
- 1977
Citation Context ... determined solely based on the chosen stimulus’ weight. Learning about the weight from the outcome then depends on unobserved information: which stimulus was chosen. Expectation-maximization methods (Dempster et al., 1977; Griffiths and Yuille, this volume) address this problem by repeatedly alternating two steps: estimating the hidden information based on the current beliefs about the weights (‘E step’), then updatin... |

2486 | A New Approach to Linear Filtering and Prediction Problems. Transactions of the ASME
- Kalman
- 1960
Citation Context ...trix $\Sigma_t$ encoding the uncertainty around that mean. Because of the Gaussian assumptions, these quantities can tractably be updated trial by trial according to Bayes’ theorem, which here takes the form (Kalman, 1960): $\hat{w}_{t+1} = \hat{w}_t + \kappa_t (r_t - \hat{w}_t \cdot x_t)$ and $\Sigma_{t+1} = \Sigma_t - \kappa_t x_t^\top \Sigma_t + \sigma_d^2 I$, with Kalman gain vector $\kappa_t = \Sigma_t x_t / (x_t^\top \Sigma_t x_t + \sigma_o^2)$. Note that the update rule for the mean takes the form of the Rescorla–Wagne... |
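
The trial-by-trial update quoted in this citation context can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names `kalman_update` and `run_trials`, the unit prior covariance, and the noise variances are assumptions chosen for the example, not values taken from the chapter.

```python
import numpy as np

def kalman_update(w_hat, Sigma, x, r, sigma_o2=0.5, sigma_d2=0.0):
    """Apply one trial of the Kalman filter update.

    w_hat    : posterior mean of the weights, shape (n,)
    Sigma    : posterior covariance of the weights, shape (n, n)
    x        : stimulus vector for this trial, shape (n,)
    r        : observed reinforcement (scalar)
    sigma_o2 : observation noise variance (illustrative value)
    sigma_d2 : diffusion variance of the weights (illustrative value)
    """
    # Kalman gain: kappa = Sigma x / (x^T Sigma x + sigma_o^2)
    kappa = Sigma @ x / (x @ Sigma @ x + sigma_o2)
    # Mean update: a delta rule with per-stimulus learning rates kappa
    w_new = w_hat + kappa * (r - w_hat @ x)
    # Covariance update: Sigma - kappa x^T Sigma + sigma_d^2 I
    Sigma_new = Sigma - np.outer(kappa, x) @ Sigma + sigma_d2 * np.eye(len(x))
    return w_new, Sigma_new

def run_trials(trials, w_hat, Sigma, **noise):
    """Run a sequence of (stimulus, reinforcement) trials through the filter."""
    for x, r in trials:
        w_hat, Sigma = kalman_update(w_hat, Sigma, np.asarray(x, dtype=float), r, **noise)
    return w_hat, Sigma

# Backward blocking design: AB->R trials first, then A->R trials.
w0, S0 = np.zeros(2), np.eye(2)  # assumed prior: zero mean, unit covariance
w_ab, S_ab = run_trials([([1, 1], 1.0)] * 10, w0, S0)
w_bb, S_bb = run_trials([([1, 0], 1.0)] * 10, w_ab, S_ab)

print("after AB->R:", w_ab, "cov(A,B) =", S_ab[0, 1])
print("after A->R: ", w_bb)
```

Under these assumed settings, the off-diagonal covariance after compound training is negative, so later A→R trials lower the estimated weight for B even though B is never presented again.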

1730 | Aspects of the theory of syntax - Chomsky - 1965 |

904 | A behavioral model of rational choice - Simon - 1955 |

869 | An Introduction to Variational Methods for Graphical Models
- Jordan, Ghahramani, et al.
- 1999
Citation Context ...be evident in patterns of variability over trials or subjects, which is not the focus of the present work. We will focus instead on deterministic simplifications of difficult mathematical forms (e.g. Jordan et al., 1999), such as the usage of lower bounds or maximum likelihood approximations. One critical feature of these approximations is that they often involve steps that have the consequence of discarding relevan... |

843 | Adaptive mixtures of local experts
- Jacobs, Jordan
- 1991
Citation Context ...e stimuli are treated as competing predictors (rather than cooperating ones: Jacobs et al., 1991a,b). For instance, one alternative formulation is that of an additive mixture of Gaussians, which uses an extra vector of parameters $\pi_t \in \theta$ to capture the competition: $\Pr(r_t \mid x_t, w_t, \sigma_o, \pi_t) \propto \sum \pi_t$... |

805 | A view of the EM algorithm that justifies incremental, sparse, and other variants
- Neal, Hinton
- 1998
Citation Context ...imate to be true (‘M step’). This process can be understood to perform coordinate ascent on a particular error function, and is guaranteed to reduce (or at least not increase) the error at each step (Neal and Hinton, 1998). An online form of EM is appropriate for learning in the generative model of Equation 3. (Assume for simplicity that the weights do not change, i.e., that Equation 4 obtains.) At each trial, the E s... |

737 | On sequential Monte Carlo sampling methods for Bayesian
- Doucet, Godsill, et al.
- 2000
Citation Context ...rinciples of reasoning. Tools for inferential approximation may crudely be split into two categories, though these are often employed together. Monte Carlo techniques such as particle filtering (e.g. Doucet et al., 2000) approximate statistical computations by averaging over random samples. While these methods may be relevant to psychological modeling, the hallmarks of their usage would mainly be evident in patterns... |

281 | A family of algorithms for approximate Bayesian inference. Doctoral dissertation
- Minka
- 2001
Citation Context ...rsive form, but simplify the posterior distribution after each update to enable efficient approximate computation of subsequent updates. Such methods are broadly known as assumed density filters (see Minka, 2001, who also discusses issues of trial ordering). Typically, the posterior distribution is chosen to have a simple functional form (e.g. Gaussian, with a diagonal covariance matrix), and to have its par... |

199 | Task decomposition through competition in a modular connectionist architecture: what and where vision tasks
- Jacobs, Jordan, et al.
- 1991
Citation Context ...e stimuli are treated as competing predictors (rather than cooperating ones: Jacobs et al., 1991a,b). For instance, one alternative formulation is that of an additive mixture of Gaussians, which uses an extra vector of parameters $\pi_t \in \theta$ to capture the competition: $\Pr(r_t \mid x_t, w_t, \sigma_o, \pi_t) \propto \sum \pi_t$... |

195 | A theory of attention: Variations in the associability of stimulus with reinforcement
- Mackintosh
- 1975
Citation Context ... example of a particularly relevant learning algorithm and is related to a number of important behavioral models (Kruschke, 2001; Mackintosh, 1975), though it does have some empirical shortcomings related to its assumptions about cue combination (Dayan and Long, 1998). Recall that, according to this generative model, one stimulus out of those p... |

152 | Products of experts
- Hinton
- 1999
Citation Context ...ining other phenomena such as overshadowing and inhibitory conditioning, and ultimately favors alternatives to Equation 3 in which cues cooperate to produce the net observation (Dayan and Long, 1998; Hinton, 1999; Jacobs et al., 1991a). Despite this failure, the competitive model does exhibit forward blocking, albeit through a responsibility-sharing mechanism (Mackintosh, 1975) rather than a weight-sharing me... |

144 | Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
- Daw, Niv, et al.
- 2005
Citation Context ...hat subjects actually compute both forms simultaneously, but then reconcile the answers, making an adaptive decision how much to trust each, much as in other cases of Bayesian evidence reconciliation (Daw et al., 2005). The ‘exact’ computation might not always be the most accurate, if in biological tissue the extra computations incur additional computational noise; it might therefore be worthwhile to expend extra ... |

133 | Cortical substrates for exploratory decisions in humans - Daw, O’Doherty, et al. |

132 | Keeping neural networks simple by minimizing the description length of the weights - Hinton, Camp - 1993 |

127 | Neurophilosophy. Toward a Unified Science of Mind-Brain - Churchland - 1986 |

111 | Bayesian Q-learning
- Dearden, Friedman, et al.
- 1998
Citation Context ...ng the unobserved responsibilities, and then conditioning on them as though they were observed data (see also the mixture update of Dearden et al. 1998). Since it conducts inference using synthetic in place of observed quantities, this rule would have the flavor of Kruschke’s locally Bayesian scheme, and indeed would be a route to find statistically... |

98 | Forward and backward blocking in human contingency judgment
- Shanks
- 1985
Citation Context ...(2006), who argued against pure Bayesian learning models in favor of a particular heuristic treatment based on the effects of trial ordering in tasks such as backward blocking (Lovibond et al., 2003; Shanks, 1985; Wasserman and Berglan, 1998) and highlighting (Kruschke, 2003, 2006; Medin and Bettger, 1991). His ‘locally Bayesian’ model, while drawing on Bayesian methods, neither corresponds to exact inference... |

79 | Toward a unified model of attention in associative learning
- Kruschke
- 2001
Citation Context ...maining probability) and its weight alone provides the mean for the whole reward. This is known to relate to a family of models of animal conditioning due to Mackintosh (1975; see Dayan & Long, 1998; Kruschke, 2001), and formalizes in a normative manner the notion in those models of cue-specific attentional weightings, with different stimuli having different degrees of influence over the predictions. Finally, th... |

64 | The integrative activity of the brain - Konorski - 1967 |

62 | Bayesian methods for mixtures of experts
- Waterhouse, MacKay, et al.
- 1996
Citation Context ...gates only a point, maximum likelihood, estimate of the posterior distribution over the weights. This could be rectified by adopting a so-called ensemble learning approach (Hinton and van Camp, 1993; Waterhouse et al., 1996), in which a full (approximate) distribution over the learned parameters is maintained and propagated, rather than just a point estimate. In ensemble learning, this distribution is improved by iterat... |

51 | Problem structure and the use of base-rate information from experience
- Medin, Edelson
- 1988
Citation Context ...onal Kalman filter example of Figure 19.3a, the highlighting effect here doesn’t arise until the second block of trials. This means that this EM model doesn’t explain the ‘inverse base rate effect’ (Medin and Edelson, 1988), which is the highlighting effect shown even using only the first block when R predominates. One reason for this, in turn, is the key competitive feature of this rule, that the predictions made by ea... |

46 | Attention in learning
- Kruschke
- 2003
Citation Context ...avor of a particular heuristic treatment based on the effects of trial ordering in tasks such as backward blocking (Lovibond et al., 2003; Shanks, 1985; Wasserman and Berglan, 1998) and highlighting (Kruschke, 2003, 2006; Medin and Bettger, 1991). His ‘locally Bayesian’ model, while drawing on Bayesian methods, neither corresponds to exact inference nor is motivated or justified as an approximation to the ideal... |

43 | Learning and selective attention
- Dayan, Kakade, et al.
- 2000
Citation Context ...roximation. We show how primacy effects, as seen in highlighting, qualitatively characterize a number of simplified inference schemes. We start by describing the Kalman filter model of conditioning (Dayan et al., 2000), which arises as the exact inference process associated with an analytically tractable, but highly simplified, Bayesian model of change. We show that this model leads to certain trial order effects,... |

38 | Biological significance in forward and backward blocking: Resolution of a discrepancy between animal conditioning and human causal judgment
- Miller, Matute
- 1996
Citation Context ...r to be neurally distinguishable (Balleine and Killcross, 2006). In fact, there is evidence for similar behavioral dissociations coming from attempts to demonstrate retrospective revaluation in rats (Miller and Matute, 1996). When training is conducted directly in terms of stimulus-reinforcer pairings, no retrospective revaluation is generally seen (as with our diagonalized covariance Kalman filter), but revaluation doe... |

38 | Backward blocking and recovery from overshadowing in human causal judgement: The role of within-compound associations. The Quarterly
- Wasserman, Berglan
- 1998
Citation Context ...gued against pure Bayesian learning models in favor of a particular heuristic treatment based on the effects of trial ordering in tasks such as backward blocking (Lovibond et al., 2003; Shanks, 1985; Wasserman and Berglan, 1998) and highlighting (Kruschke, 2003, 2006; Medin and Bettger, 1991). His ‘locally Bayesian’ model, while drawing on Bayesian methods, neither corresponds to exact inference nor is motivated or justifie... |

35 | Eds.), Punishment and aversive behavior (pp
- Kamin
- 1969
Citation Context ...to B alone is then tested. Famously, predictions of R given B probes are attenuated (blocked) when the AB→R training is preceded by a set of A→R trials, in which A alone is paired with reinforcement (Kamin, 1969). One intuition for this forward blocking effect is that if reinforcement is explicable on the basis of A alone, then the AB→R trials do not provide evidence that B is also associated with reinforcem... |

35 | A theory of Pavlovian conditioning and the effectiveness of reinforcement and non-reinforcement - Rescorla, Wagner - 1972 |

32 | Bayesian theories of conditioning in a changing world - Courville, Daw, et al. - 2006 |

29 | Acquisition and extinction in autoshaping
- Kakade, Dayan
- 2002
Citation Context ...del precludes any effect of trial ordering. Kruschke (2006) focuses his critique on exactly this issue. However, the assumption that trials are IID is a poor match to a typically nonstationary world (Kakade and Dayan, 2002). Instead, most conditioning tasks (and also the real-world foraging or inference scenarios they stylize), involve some sort of change in the contingencies of interest (in this case, the coupling bet... |

28 | Local Bayesian learning with applications to retrospective revaluation and highlighting
- Kruschke
- 2006
Citation Context ...blocking (e.g. Lovibond et al., 2003). Since forward and backward blocking just involve a rearrangement of the same trials, this asymmetry is a noteworthy demonstration of sensitivity to trial order (Kruschke, 2006), and thus refutation of the IID model. The simulation results in Figure 19.1a confirm that forward and backward blocking are equally strong under the IID Kalman filter model. It may not, however, be... |

26 | Explaining away in weight space
- Dayan, Kakade
- 2000
Citation Context ... obvious how the rule accomplishes retrospective revaluation (Kakade and Dayan, 2001). Figure 19.1b illustrates the posterior distribution over $w_A$ and $w_B$ following AB→R training in backward blocking. The key point is that they are anticorrelated, since together they should add up to ... |

24 | Model uncertainty in classical conditioning - Courville, Daw, et al. - 2004 |

20 | Vision: A computational approach - Marr - 1982 |

19 | Parallel incentive processing: an integrated view of amygdala function
- Balleine, Killcross
- 2006
Citation Context ... filter’s representation of interstimulus covariance) and a simpler stimulus-reward one (perhaps related to our diagonalized Kalman filter); such processes also appear to be neurally distinguishable (Balleine and Killcross, 2006). In fact, there is evidence for similar behavioral dissociations coming from attempts to demonstrate retrospective revaluation in rats (Miller and Matute, 1996). When training is conducted directly ... |

19 | The role of learning in motivation
- Dickinson, Balleine
- 2002
Citation Context ...mental conditioning is an analogous division between an elaborate, cognitive, (and likely computationally noisy) ‘goal-directed’ pathway, and a simpler (but statistically inefficient) ‘habitual’ one (Dickinson and Balleine, 2002). In this setting, the idea of normatively trading off approximate value-inference approaches characteristic of the systems has been formalized in terms of their respective uncertainties, and explains... |

16 | Similarity and discrimination in classical conditioning: A latent variable account - Courville, Daw - 2004 |

15 | Forward and backward blocking of causal judgement is enhanced by additivity of effect magnitude
- Lovibond, Been, et al.
- 2003
Citation Context ...discussion of Kruschke (2006), who argued against pure Bayesian learning models in favor of a particular heuristic treatment based on the effects of trial ordering in tasks such as backward blocking (Lovibond et al., 2003; Shanks, 1985; Wasserman and Berglan, 1998) and highlighting (Kruschke, 2003, 2006; Medin and Bettger, 1991). His ‘locally Bayesian’ model, while drawing on Bayesian methods, neither corresponds to e... |

14 | Sensitivity to changes in base-rate information
- Medin, Bettger
- 1991
Citation Context ...euristic treatment based on the effects of trial ordering in tasks such as backward blocking (Lovibond et al., 2003; Shanks, 1985; Wasserman and Berglan, 1998) and highlighting (Kruschke, 2003, 2006; Medin and Bettger, 1991). His ‘locally Bayesian’ model, while drawing on Bayesian methods, neither corresponds to exact inference nor is motivated or justified as an approximation to the ideal. While we agree with Kruschke ... |

12 | Statistical models of conditioning
- Dayan, Long
- 1998
Citation Context ...with r being a binary variable reporting whether a patient developed an allergic reaction from eating them (e.g. r = R or 0). We briefly review a familiar statistical approach to such a problem (e.g. Dayan and Long, 1998; Griffiths and Yuille, this volume). This begins by assuming a space of hypotheses about how the data (D, a sequence of x→r pairs) were generated.... |

10 | Expected and unexpected uncertainty: ACh and NE
- Yu, Dayan
- 2003
Citation Context ...experimental circumstances, or may be an approximation of convenience. In particular, contingencies often change more abruptly (as between experimental blocks). One way to formalize this possibility (Yu and Dayan, 2003, 2005) is to assume that in addition to smooth Gaussian diffusion, the weights are occasionally subject to a larger shock (e.g. another Gaussian with width $\sigma_j \gg \sigma_d$). However, the resulting model pr... |

9 | Causal learning in rats and humans: A minimal rational model - Waldmann, Cheng, et al. - 2008 |

7 | A behavioral model of rational choice. Models of Man - Simon - 1957 |

5 | Technical introduction: A primer on probabilistic inference - Griffiths, Yuille |

5 | On the construction of a reduced rank square-root Kalman filter for efficient uncertainty propagation
- Treebushny, Madsen
Citation Context ... the subject to carry less information between trials (because of the reduction in rank), and can also enable simplification of the matrix calculations for the subsequent update to the Kalman filter (Treebushny and Madsen, 2005). More precisely, we approximate the inverse posterior covariance after one trial, $(\Sigma_t - \kappa_t x_t^\top \Sigma_t)^{-1}$, by retaining only those n basis vectors from its singular value decomposition that have the hig... |

4 | attention, and conditioning - Predictability |

1 | Touretzky (2003). Model uncertainty in classical conditioning - Courville, Daw, et al. |