Toward a method of selecting among computational models of cognition
Psychological Review, 2002
Cited by 41 (3 self)
The question of how one should decide among competing explanations of data is at the heart of the scientific enterprise. Computational models of cognition are increasingly being advanced as explanations of behavior. The success of this line of inquiry depends on the development of robust methods to guide the evaluation and selection of these models. This article introduces a method of selecting among mathematical models of cognition known as minimum description length, which provides an intuitive and theoretically well-grounded understanding of why one model should be chosen. A central but elusive concept in model selection, complexity, can also be derived with the method. The adequacy of the method is demonstrated in 3 areas of cognitive modeling: psychophysics, information integration, and categorization.
How should one choose among competing theoretical explanations of data? This question is at the heart of the scientific enterprise, regardless of whether verbal models are being tested in an experimental setting or computational models are being evaluated in simulations. A number of criteria have been proposed to assist in this endeavor, summarized nicely by Jacobs and Grainger
Key Concepts in Model Selection: Performance and Generalizability
Journal of Mathematical Psychology, 2000
Cited by 26 (11 self)
methods of model selection, and how do they work? Which methods perform better than others, and in what circumstances? These questions rest on a number of key concepts in a relatively underdeveloped field. The aim of this essay is to explain some background concepts, highlight some of the results in this special issue, and to add my own. The standard methods of model selection include classical hypothesis testing, maximum likelihood, Bayes method, minimum description length, cross-validation and Akaike’s information criterion. They all provide an implementation of Occam’s razor, in which parsimony or simplicity is balanced against goodness-of-fit. These methods primarily take account of the sampling errors in parameter estimation, although their relative success at this task depends on the circumstances. However, the aim of model selection should also include the ability of a model to generalize to predictions in a different domain. Errors of extrapolation, or generalization, are different from errors of parameter estimation. So, it seems that simplicity and parsimony may be an additional factor in managing these errors, in which case the standard methods of model selection are incomplete implementations of Occam’s razor.
1. WHAT IS MODEL SELECTION?
William of Ockham (1285-1347/49) will always be remembered for his famous postulations of Ockham’s razor (also spelled ‘Occam’), which states that entities are not to be multiplied beyond necessity. In a similar vein, Sir Isaac Newton’s first rule of hypothesizing instructs us that we are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances. While they
This paper is derived from a presentation at the Methods of Model Selection symposium at Indiana University
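The balance between parsimony and goodness-of-fit mentioned in this abstract can be made concrete with a small sketch. The example below is not taken from the paper: the coin-flip data (60 heads in 100 trials), the two candidate models, and all function names are illustrative assumptions. It uses the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of free parameters.

```python
import math

def bernoulli_log_likelihood(heads, n, p):
    """Log-likelihood of one specific sequence with `heads` successes in n trials."""
    return heads * math.log(p) + (n - heads) * math.log(1.0 - p)

def aic(log_lik, k):
    """Akaike's information criterion: penalty of 2 per free parameter."""
    return 2.0 * k - 2.0 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: penalty of ln(n) per free parameter."""
    return k * math.log(n) - 2.0 * log_lik

# Hypothetical data, chosen only for illustration.
heads, n = 60, 100

# Model "fixed": p = 0.5, no free parameters.
# Model "free": p estimated by maximum likelihood (heads / n), one free parameter.
log_liks = {
    "fixed": (bernoulli_log_likelihood(heads, n, 0.5), 0),
    "free": (bernoulli_log_likelihood(heads, n, heads / n), 1),
}

aic_scores = {name: aic(ll, k) for name, (ll, k) in log_liks.items()}
bic_scores = {name: bic(ll, k, n) for name, (ll, k) in log_liks.items()}

# Lower scores are better under both criteria.
best_by_aic = min(aic_scores, key=aic_scores.get)
best_by_bic = min(bic_scores, key=bic_scores.get)
```

For these assumed data the two criteria disagree: AIC prefers the one-parameter model, while BIC, with its heavier penalty at n = 100, prefers the fixed model. This is one small instance of the point that the relative success of the methods depends on the circumstances.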
Traps in the route to models of memory and decision
Psychonomic Bulletin & Review, 2002
Cited by 17 (2 self)
Over more than a half century of experience in research on learning, memory, and decision, I have come to believe that the most substantial and enduring advances have not been in the accumulation of empirical facts or the construction of models, but in the production of fruitful interactions between models and experimental research. Most experimental facts require continual reinterpretation and most models drop by the wayside like autumn leaves, but the results of interactions between models and experiments constitute most of our generalizable knowledge. Success in the interactive research effort depends not only on clearly formulated models and well-conducted experiments, but, just as importantly, on sound interpretations of the results of applying the models to the experiments. This interpretive phase of the effort is in some respects the most difficult, and I take as my main task in this article an account of some of the issues that have to be resolved and some of the traps that have to be avoided in order for the process to run to a successful conclusion. As a preliminary, I turn to a review of the basic concept of applying a model to data as it has evolved since its first rudimentary instantiation in the literature of memory and decision more than a century ago.
Applying Models to Experiments
Details of techniques for fitting curves, or, more broadly, formal models, whether mathematical or computer imple-
This article presents in substance the author’s Governing Board Keynote
How to Fit a Response Time Distribution
Cited by 13 (0 self)
Among the most valuable tools in behavioral science is statistically fitting mathematical models of cognition to data, response time distributions in particular. However, techniques for fitting distributions vary widely and little is known about the efficacy of different techniques. In this article, we assessed several fitting techniques by simulating six widely cited models of response time and using the fitting procedures to recover model parameters. The techniques include the maximization of likelihood and least-squares fits of the theoretical distributions to different empirical estimates of the simulated distributions. A running example was used to illustrate the different estimation and fitting procedures. The simulation studies revealed that empirical density estimates are biased even for very large sample sizes. Some fitting techniques yielded more accurate and less variable parameter estimates than others. Methods that involved least-squares fits to density estimates generally yielded very poor parameter estimates.
How to Fit a Response Time Distribution
The importance of considering the entire response time (RT) distribution in testing formal models of cognition is now widely appreciated. Fitting a model to mean RT alone can mask important details of the data that examination of the entire distribution would reveal, such as the behavior of fast and slow responses across the conditions of an experiment (e.g., Heathcote, Popiel & Mewhort, 1991), the extent of facilitation between perceptual channels (Miller, 1982), and the effects of practice on RT quantiles (Logan, 1992). Techniques for testing hypotheses based on the RT distribution have been developed (Townsend, 1990). In addition, the RT distribution provides an important meeting ground between theory and da...
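The parameter-recovery logic described in this abstract (simulate from a model with known parameters, fit, compare estimates with the true values) can be sketched in a few lines. This is only an illustration under stated assumptions: the shifted exponential used here is a stand-in chosen because its maximum-likelihood estimates have a closed form, and it is not claimed to be one of the six models the article examines. The parameter values, seed, and function names are likewise hypothetical.

```python
import random
import statistics

def simulate_rts(n, shift, rate, seed=0):
    """Draw n simulated response times (seconds) from a shifted exponential."""
    rng = random.Random(seed)
    return [shift + rng.expovariate(rate) for _ in range(n)]

def fit_shifted_exponential(rts):
    """Closed-form maximum-likelihood estimates for a shifted exponential.

    The MLE of the shift is the smallest observation; the MLE of the rate is
    the reciprocal of the mean excess over that shift.
    """
    shift_hat = min(rts)
    rate_hat = 1.0 / (statistics.fmean(rts) - shift_hat)
    return shift_hat, rate_hat

# Assumed "true" parameters for the simulation: 0.3 s shift, rate of 5 per second.
true_shift, true_rate = 0.3, 5.0
rts = simulate_rts(5000, true_shift, true_rate)
shift_hat, rate_hat = fit_shifted_exponential(rts)

# Recovery error: how far the fitted values land from the generating values.
shift_error = abs(shift_hat - true_shift)
rate_relative_error = abs(rate_hat - true_rate) / true_rate
```

Repeating this over many simulated samples, and over different fitting techniques, would give the accuracy and variability comparisons that the abstract reports.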
Bayesian Statistics
in WWW', Computing Science and Statistics, 1989
Cited by 13 (0 self)
This dissertation presents two topics from opposite disciplines: one is from a parametric realm and the other is based on nonparametric methods. The first topic is a jackknife maximum likelihood approach to statistical model selection and the second one is a convex hull peeling depth approach to nonparametric massive multivariate data analysis. The second topic includes simulations and applications on massive astronomical data. First, we present a model selection criterion, minimizing the Kullback-Leibler distance by using the jackknife method. Various model selection methods have been developed to choose a model of minimum Kullback-Leibler distance to the true model, such as Akaike information criterion (AIC), Bayesian information criterion (BIC), minimum description length (MDL), and bootstrap information criterion. Likewise, the jackknife method chooses a model of minimum Kullback-Leibler distance through bias reduction. This bias, which is inevitable in model
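The idea of bias reduction by the jackknife, which this abstract relies on, can be shown with a generic textbook example. The sketch below is not the dissertation's model selection criterion: it applies the standard leave-one-out jackknife bias correction to the plug-in variance estimator, and the data and function names are assumptions made for illustration.

```python
import statistics

def plugin_variance(xs):
    """Biased (divide-by-n) variance estimator."""
    m = statistics.fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def jackknife_bias_corrected(estimator, xs):
    """Return the jackknife bias-corrected value of `estimator` on `xs`.

    The bias is estimated as (n - 1) * (mean of leave-one-out estimates
    minus the full-sample estimate) and subtracted from the full estimate.
    """
    n = len(xs)
    full = estimator(xs)
    leave_one_out = [estimator(xs[:i] + xs[i + 1:]) for i in range(n)]
    bias = (n - 1) * (statistics.fmean(leave_one_out) - full)
    return full - bias

# Hypothetical data, for illustration only.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3]
corrected = jackknife_bias_corrected(plugin_variance, data)
```

For this particular estimator the correction is exact: the jackknife-corrected plug-in variance equals the usual unbiased (divide-by-n-1) sample variance.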
Simplicity versus likelihood in visual perception: from surprisals to precisals
Psychological Bulletin, 2000
Cited by 11 (2 self)
The likelihood principle states that the visual system prefers the most likely interpretation of a stimulus, whereas the simplicity principle states that it prefers the most simple interpretation. This study investigates how close these seemingly very different principles are by combining findings from classical, algorithmic, and structural information theory. It is argued that, in visual perception, the two principles are perhaps very different with respect to the viewpoint-independent aspects of perception but probably very close with respect to the viewpoint-dependent aspects which, moreover, seem decisive in everyday perception. This implies that either principle may have guided the evolution of visual systems and that the simplicity paradigm may provide perception models with the necessary quantitative specifications of the often plausible but also intuitive ideas provided by the likelihood paradigm.
In visual perception research, an ongoing debate concerns the question of whether the likelihood principle (Von Helmholtz, 1909/1962) or the simplicity principle (Hochberg & McAlister, 1953) provides the best explanation of the human interpretation of visual stimuli. The phenomenon to be explained is, more specifically, that human subjects usually show a clear preference for only
Model Selection by Normalized Maximum Likelihood
2005
Cited by 6 (1 self)
The Minimum Description Length (MDL) principle is an information theoretic approach to inductive inference that originated in algorithmic coding theory. In this approach, data are viewed as codes to be compressed by the model. From this perspective, models are compared on their ability to compress a data set by extracting useful information in the data apart from random noise. The goal of model selection is to identify the model, from a set of candidate models, that permits the shortest description length (code) of the data. Since Rissanen originally formalized the problem using the crude ‘two-part code’ MDL method in the 1970s, many significant strides have been made, especially in the 1990s, with the culmination of the development of the refined ‘universal code’ MDL method, dubbed Normalized Maximum Likelihood (NML). It represents an elegant solution to the model selection problem. The present paper provides a tutorial review on these latest developments with a special focus on NML. An application example of NML in cognitive modeling is also provided.
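A minimal sketch may help show what selecting the model with the shortest description length looks like under NML. It is not the paper's application example: it uses the Bernoulli model, for which the NML normalizer (the sum of maximized likelihoods over all possible data sets of length n) can be computed exactly, and compares it with a parameter-free fair-coin model. The data and function names are illustrative assumptions, and code lengths are in nats.

```python
import math

def bernoulli_max_log_lik(heads, n):
    """Maximized log-likelihood of a sequence with `heads` successes in n trials."""
    ll = 0.0
    if heads > 0:
        ll += heads * math.log(heads / n)
    if heads < n:
        ll += (n - heads) * math.log((n - heads) / n)
    return ll

def bernoulli_nml_normalizer(n):
    """Sum of maximized likelihoods over all 2**n binary sequences of length n.

    Sequences with the same number of successes share the same maximized
    likelihood, so the sum is grouped by that count.
    """
    return sum(
        math.comb(n, k) * math.exp(bernoulli_max_log_lik(k, n))
        for k in range(n + 1)
    )

def nml_code_length(heads, n):
    """NML description length: lack of fit plus log of the normalizer (complexity)."""
    return -bernoulli_max_log_lik(heads, n) + math.log(bernoulli_nml_normalizer(n))

def fair_coin_code_length(n):
    """A model with no free parameters has normalizer 1, so only lack of fit remains."""
    return n * math.log(2.0)

n = 100
prefers_bernoulli_at_60 = nml_code_length(60, n) < fair_coin_code_length(n)
prefers_bernoulli_at_70 = nml_code_length(70, n) < fair_coin_code_length(n)
```

Under these assumptions, 60 heads in 100 trials is not enough evidence to justify the extra parameter, whereas 70 heads is: the complexity term, the log of the normalizer, is the price the more flexible model pays for its ability to fit any outcome well.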
Incremental planning in sequence production
Psychological Review, 2003
Cited by 5 (0 self)
People produce long sequences such as speech and music with incremental planning: mental preparation of a subset of sequence events. The authors model in music performance the sequence events that can be retrieved and prepared during production. Events are encoded in terms of their serial order and timing relative to other events in a planning increment, a contextually determined distribution of event activations. Planning is facilitated by events’ metrical similarity and serial/temporal proximity and by developmental changes in short-term memory. The model’s predictions of larger planning increments as production rate decreases and as producers’ age–experience increases are confirmed in serial-ordering errors produced by adults and children. Incremental planning is considered as a general retrieval constraint in serially ordered behaviors.
When people produce long, complex sequences such as speech and music, they must plan what event to produce next (the serial-order problem) and when to produce it (the timing problem). Bernstein (1967) and Lashley (1951) both pointed to music as a quintessential example of serial-ordering abilities because of its complexity, length, and temporal properties. Although musical
Within-category discontinuity interacts with verbal rule complexity in perceptual category learning
Journal of Experimental Psychology: Learning, Memory, and Cognition, 2007
Cited by 4 (3 self)
A test of the predicted interaction between within-category discontinuity and verbal rule complexity on information-integration and rule-based category learning was conducted. Within-category discontinuity adversely affected information-integration category learning but not rule-based category learning. Model-based analyses suggested that some information-integration participants improved performance by recruiting more “units” in the discontinuous condition. Verbal rule complexity adversely affected rule-based category learning but not information-integration category learning. Model-based analyses suggested that the rule-based effect was on both decision criterion learning and variability in decision criterion placement. These results suggest that within-category discontinuity and decision rule complexity differentially impact information-integration and rule-based category learning and provide information
The Neural Correlates of Problem States: Testing fMRI Predictions of a Computational Model of Multitasking
2010
Cited by 4 (4 self)
Background: It has been shown that people can only maintain one problem state, or intermediate mental representation, at a time. When more than one problem state is required, for example in multitasking, performance decreases considerably. This effect has been explained in terms of a problem state bottleneck. Methodology: In the current study we use the complementary methodologies of computational cognitive modeling and neuroimaging to investigate the neural correlates of this problem state bottleneck. In particular, an existing computational cognitive model was used to generate a priori fMRI predictions for a multitasking experiment in which the problem state bottleneck plays a major role. Hemodynamic responses were predicted for five brain regions, corresponding to five cognitive resources in the model. Most importantly, we predicted the intraparietal sulcus to show a strong effect of the problem state manipulations. Conclusions: Some of the predictions were confirmed by a subsequent fMRI experiment, while others were not matched by the data. The experiment supported the hypothesis that the problem state bottleneck is a plausible cause of the

