Results 1–10 of 32
Toward a method of selecting among computational models of cognition
 Psychological Review
, 2002
"... The question of how one should decide among competing explanations of data is at the heart of the scientific enterprise. Computational models of cognition are increasingly being advanced as explanations of behavior. The success of this line of inquiry depends on the development of robust methods to ..."
Abstract

Cited by 80 (4 self)
 Add to MetaCart
The question of how one should decide among competing explanations of data is at the heart of the scientific enterprise. Computational models of cognition are increasingly being advanced as explanations of behavior. The success of this line of inquiry depends on the development of robust methods to guide the evaluation and selection of these models. This article introduces a method of selecting among mathematical models of cognition known as minimum description length, which provides an intuitive and theoretically well-grounded understanding of why one model should be chosen. A central but elusive concept in model selection, complexity, can also be derived with the method. The adequacy of the method is demonstrated in 3 areas of cognitive modeling: psychophysics, information integration, and categorization. How should one choose among competing theoretical explanations of data? This question is at the heart of the scientific enterprise, regardless of whether verbal models are being tested in an experimental setting or computational models are being evaluated in simulations. A number of criteria have been proposed to assist in this endeavor, summarized nicely by Jacobs and Grainger ...
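As a concrete illustration of the two-part idea behind minimum description length, the sketch below scores polynomial fits by a crude code length: a Gaussian-error data-fit term plus a (k/2) log n parameter cost. The data, the `mdl_score` helper, and the asymptotic penalty form are illustrative assumptions, not taken from the article:

```python
import numpy as np

def mdl_score(y, y_hat, k):
    """Crude two-part description length under Gaussian errors:
    data cost -log L(theta_hat) (up to additive constants) plus
    (k/2) log n for transmitting the k fitted parameters."""
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    return 0.5 * n * np.log(rss / n) + 0.5 * k * np.log(n)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, size=50)  # data that is truly linear

scores = {}
for degree in (1, 2, 5):
    coeffs = np.polyfit(x, y, degree)
    scores[degree] = mdl_score(y, np.polyval(coeffs, x), degree + 1)

best_degree = min(scores, key=scores.get)  # MDL trades fit against complexity
```

A higher-degree polynomial always lowers the residual sum of squares, but its larger parameter cost typically leaves the simpler model with the shorter total description.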
Akaike’s information criterion and recent developments in information complexity
 Journal of Mathematical Psychology
"... criterion (AIC). Then, we present some recent developments on a new entropic or information complexity (ICOMP) criterion of Bozdogan (1988a, 1988b, 1990, 1994d, 1996, 1998a, 1998b) for model selection. A rationale for ICOMP as a model selection criterion is that it combines a badnessoffit term (su ..."
Abstract

Cited by 57 (5 self)
 Add to MetaCart
criterion (AIC). Then, we present some recent developments on a new entropic or information complexity (ICOMP) criterion of Bozdogan (1988a, 1988b, 1990, 1994d, 1996, 1998a, 1998b) for model selection. A rationale for ICOMP as a model selection criterion is that it combines a badness-of-fit term (such as minus twice the maximum log likelihood) with a measure of complexity of a model differently than AIC, or its variants, by taking into account the interdependencies of the parameter estimates as well as the dependencies of the model residuals. We operationalize the general form of ICOMP based on the quantification of the concept of overall model complexity in terms of the estimated inverse-Fisher information matrix. This approach results in an approximation to the sum of two Kullback-Leibler distances. Using the correlational form of the complexity, we further provide yet another form of ICOMP to take into account the interdependencies (i.e., correlations) among the parameter estimates of the model. Later, we illustrate the practical utility and the importance of this new model selection criterion by providing several ...
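A minimal numerical sketch of the general form the abstract describes, for an i.i.d. univariate Gaussian model: a lack-of-fit term (minus twice the maximized log-likelihood) plus Bozdogan's C1 entropic complexity of the estimated inverse-Fisher information matrix. The diagonal asymptotic form of that matrix and the function names are assumptions made for illustration:

```python
import numpy as np

def c1_complexity(cov):
    """Bozdogan's maximal entropic complexity C1 of a covariance matrix:
    (s/2) log(tr(cov)/s) - (1/2) log det(cov); zero iff cov is proportional
    to the identity, so it measures parameter interdependency."""
    s = cov.shape[0]
    return 0.5 * s * np.log(np.trace(cov) / s) - 0.5 * np.linalg.slogdet(cov)[1]

def icomp_gaussian(x):
    """ICOMP(IFIM) sketch for an i.i.d. N(mu, var) model: -2 max log-likelihood
    plus C1 of the asymptotic covariance of (mu_hat, var_hat)."""
    n = len(x)
    var = x.var()
    neg2_loglik = n * (np.log(2.0 * np.pi * var) + 1.0)
    ifim = np.diag([var / n, 2.0 * var ** 2 / n])  # estimated inverse-Fisher matrix
    return neg2_loglik + c1_complexity(ifim)

rng = np.random.default_rng(1)
x = rng.normal(5.0, 2.0, size=200)
score = icomp_gaussian(x)
```

Because C1 vanishes only when the estimated parameter covariance is spherical, models with strongly interdependent estimates pay a larger complexity penalty than AIC would charge them.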
Unsupervised Learning Using MML
 IN MACHINE LEARNING: PROCEEDINGS OF THE THIRTEENTH INTERNATIONAL CONFERENCE (ICML 96)
, 1996
"... This paper discusses the unsupervised learning problem. An important part of the unsupervised learning problem is determining the number of constituent groups (components or classes) which best describes some data. We apply the Minimum Message Length (MML) criterion to the unsupervised learning prob ..."
Abstract

Cited by 41 (5 self)
 Add to MetaCart
This paper discusses the unsupervised learning problem. An important part of the unsupervised learning problem is determining the number of constituent groups (components or classes) which best describes some data. We apply the Minimum Message Length (MML) criterion to the unsupervised learning problem, modifying an earlier such MML application. We give an empirical comparison of criteria prominent in the literature for estimating the number of components in a data set. We conclude that the Minimum Message Length criterion performs better than the alternatives on the data considered here for unsupervised learning tasks.
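Deriving full MML message lengths is involved, so the sketch below uses a BIC-style penalized likelihood as a hedged stand-in to illustrate the workflow the abstract describes: fit 1-D Gaussian mixtures with increasing numbers of components by EM and keep the number that minimizes the criterion. All function names, the penalty choice, and the data are illustrative:

```python
import numpy as np

def em_gmm_1d(x, k, iters=200):
    """Tiny EM fit of a k-component 1-D Gaussian mixture (quantile init);
    returns the maximized log-likelihood."""
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))
    var = np.full(k, x.var())
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)          # E-step
        nk = resp.sum(axis=0)                                   # M-step
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return np.log(dens.sum(axis=1)).sum()

def penalized_score(loglik, k, n):
    """BIC-style stand-in for a message-length criterion:
    a 1-D k-component mixture has 3k - 1 free parameters."""
    return -2.0 * loglik + (3 * k - 1) * np.log(n)

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-4, 1, 300), rng.normal(4, 1, 300)])
scores = {k: penalized_score(em_gmm_1d(x, k), k, len(x)) for k in (1, 2, 3)}
best_k = min(scores, key=scores.get)
```

On this well-separated two-component sample, a one-component fit is decisively worse; the criterion then arbitrates between two and three components, where the extra parameters of the third buy little likelihood.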
The Great Equalizer? Consumer Choice Behavior at Internet Shopbots
 SLOAN SCHOOL OF MANAGEMENT, MIT
, 2000
"... Our research empirically analyzes consumer behavior at Internet shopbots — sites that allow consumers to make “oneclick ” price comparisons for product offerings from multiple retailers. By allowing researchers to observe exactly what information the consumer is shown and their search behavior in r ..."
Abstract

Cited by 32 (0 self)
 Add to MetaCart
Our research empirically analyzes consumer behavior at Internet shopbots — sites that allow consumers to make “one-click” price comparisons for product offerings from multiple retailers. By allowing researchers to observe exactly what information the consumer is shown and their search behavior in response to this information, shopbot data has unique strengths for analyzing consumer behavior. Furthermore, the method in which the data is displayed to consumers lends itself to a utility-based evaluation process, consistent with econometric analysis techniques. While price is an important determinant of customer choice, we find that, even among shopbot consumers, branded retailers and retailers a consumer visited previously hold significant price advantages in head-to-head price comparisons. Further, customers are very sensitive to how the total price is allocated among the item price, the shipping cost, and tax, and are also quite sensitive to the ordinal ranking of retailer offerings with respect to price. We also find that consumers use brand as a proxy for a retailer’s credibility with regard to non-contractible aspects of the product bundle such as shipping time. In each case our models accurately predict consumer behavior out of sample, suggesting ...
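The utility-based evaluation process mentioned above is commonly operationalized as a conditional (multinomial) logit over the displayed offers. A minimal sketch, with an entirely hypothetical offer table and taste weights chosen only for illustration:

```python
import numpy as np

def choice_probs(offers, beta):
    """Conditional-logit choice probabilities over a shopbot offer table.
    Assumed column layout: [item price, shipping cost, branded-retailer dummy]."""
    u = offers @ beta                 # linear utility of each offer
    e = np.exp(u - u.max())           # softmax, shifted for numerical stability
    return e / e.sum()

# hypothetical offers: [item price, shipping, branded?]
offers = np.array([
    [20.0, 3.0, 1.0],
    [19.0, 5.0, 0.0],
    [21.0, 2.0, 1.0],
])
beta = np.array([-0.5, -0.7, 1.2])   # illustrative taste weights
p = choice_probs(offers, beta)
```

With these weights the third offer wins despite the highest item price: a larger disutility on shipping than on item price mirrors the finding that consumers care how the total is allocated, and the brand dummy captures the branded retailer's advantage.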
Information Criteria for Residual Generation and Fault Detection and Isolation
, 1996
"... Using an information point of view, we discuss deterministic versus stochastic tools for residual generation and evaluation for fault detection and isolation (FDI) in linear time invariant (LTI) statespace systems. In both types of approaches to offline FDI, residual generation can be viewed as t ..."
Abstract

Cited by 19 (7 self)
 Add to MetaCart
Using an information point of view, we discuss deterministic versus stochastic tools for residual generation and evaluation for fault detection and isolation (FDI) in linear time invariant (LTI) state-space systems. In both types of approaches to offline FDI, residual generation can be viewed as the design of a linear transformation of a Gaussian vector (the finite-window input-adjusted observations). Several statistical isolation methods are revisited, using both a linear transform formulation and the information content of the corresponding residuals. We formally state several multiple fault cases, with or without causality assumptions, and discuss an optimality criterion for the most general one. New information criteria are proposed for investigating the residual optimization problem.
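A toy sketch of residual generation as a linear transformation of a window of observations: the rows of W span the left null space of the stacked observability matrix, so the residual is decoupled from the unknown state yet reacts to a sensor fault. The specific system matrices, window length, and fault are assumptions made for illustration:

```python
import numpy as np

def parity_matrix(A, C, s):
    """Rows W spanning the left null space of the stacked observability
    matrix O_s = [C; CA; ...; CA^s], so that W @ O_s = 0."""
    O = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(s + 1)])
    U, S, _ = np.linalg.svd(O)
    rank = int(np.sum(S > 1e-10))
    return U[:, rank:].T

A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
W = parity_matrix(A, C, s=2)

# noise-free output window y_k, y_{k+1}, y_{k+2} generated by an unknown state
x = np.array([1.3, -0.4])
y = np.array([(C @ np.linalg.matrix_power(A, i) @ x).item() for i in range(3)])

r_ok = W @ y                 # ~0: residual is decoupled from the state
r_fault = W @ (y + 0.5)      # constant sensor bias produces a nonzero residual
```

In the stochastic setting the abstract considers, the same W is applied to noisy, input-adjusted observations, so the residual becomes a Gaussian vector whose distribution shifts under each fault hypothesis.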
Comparing Bayesian Model Class Selection Criteria by Discrete Finite Mixtures
 Information, Statistics and Induction in Science, pages 364–374, Proceedings of the ISIS'96 Conference
, 1996
"... : We investigate the problem of computing the posterior probability of a model class, given a data sample and a prior distribution for possible parameter settings. By a model class we mean a group of models which all share the same parametric form. In general this posterior may be very hard to compu ..."
Abstract

Cited by 10 (5 self)
 Add to MetaCart
We investigate the problem of computing the posterior probability of a model class, given a data sample and a prior distribution for possible parameter settings. By a model class we mean a group of models which all share the same parametric form. In general this posterior may be very hard to compute for high-dimensional parameter spaces, which is usually the case with real-world applications. In the literature several methods for computing the posterior approximately have been proposed, but the quality of the approximations may depend heavily on the size of the available data sample. In this work we are interested in testing how well the approximative methods perform in real-world problem domains. In order to conduct such a study, we have chosen the model family of finite mixture distributions. With certain assumptions, we are able to derive the model class posterior analytically for this model family. We report a series of model class selection experiments on real-world data sets, w...
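A minimal sketch of a model-class posterior that is analytic for discrete data: two classes of Bernoulli models (a fixed fair coin versus an unknown bias with a Beta prior), compared through their exact marginal likelihoods under equal class priors. This toy setting merely stands in for the discrete finite-mixture computation in the paper:

```python
import numpy as np
from math import lgamma, log

def lbeta(x, y):
    """Log of the Beta function."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def log_marg_beta_bernoulli(heads, tails, a=1.0, b=1.0):
    """Exact marginal likelihood of a binary sequence under a Beta(a, b) prior."""
    return lbeta(a + heads, b + tails) - lbeta(a, b)

def model_class_posterior(heads, tails):
    """Posterior over two model classes under equal class priors:
    class 0 = fixed fair coin, class 1 = Bernoulli with unknown bias."""
    n = heads + tails
    log_m = np.array([n * log(0.5),
                      log_marg_beta_bernoulli(heads, tails)])
    p = np.exp(log_m - log_m.max())   # normalize in log space for stability
    return p / p.sum()

p_fair, p_biased = model_class_posterior(heads=78, tails=22)
```

For 78 heads in 100 tosses the unknown-bias class dominates; for mixture families the marginal likelihood integral rarely has such a closed form, which is what motivates the approximations the paper evaluates.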
Finding Overlapping Components with MML
 Statistics and Computing
, 2000
"... We use minimum message length (MML) estimation for mixture modelling. MML estimates are derived to choose the number of components in the mixture model to best describe the data and to estimate the parameters of the component densities for Gaussian mixture models. An empirical comparison of criteria ..."
Abstract

Cited by 6 (0 self)
 Add to MetaCart
We use minimum message length (MML) estimation for mixture modelling. MML estimates are derived to choose the number of components in the mixture model to best describe the data and to estimate the parameters of the component densities for Gaussian mixture models. An empirical comparison of criteria prominent in the literature for estimating the number of components in a data set is performed.
A method to add Gaussian mixture models
, 2004
"... This version is made available in accordance with publisher policies. Please cite only the published version using the reference above. ..."
Abstract

Cited by 4 (1 self)
 Add to MetaCart
This version is made available in accordance with publisher policies. Please cite only the published version using the reference above.
ASSESSING THE NUMBER OF COMPONENTS IN MIXTURE MODELS: A REVIEW
"... Despite the widespread application of finite mixture models, the decision of how many classes are required to adequately represent the data is, according to many authors, an important, but unsolved issue. This work aims to review, describe and organize the available approaches designed to help the s ..."
Abstract

Cited by 4 (0 self)
 Add to MetaCart
Despite the widespread application of finite mixture models, the decision of how many classes are required to adequately represent the data is, according to many authors, an important, but unsolved issue. This work aims to review, describe and organize the available approaches designed to help the selection of the adequate number of mixture components (including Monte Carlo test procedures, information criteria and classification-based criteria); we also provide some published simulation results about their relative performance, with the purpose of identifying the scenarios where each criterion is most effective. Key words: Finite mixture; number of mixture components; information criteria; simulation studies.
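Two of the information criteria such reviews compare can be stated in a few lines. The sketch below shows, with purely illustrative log-likelihood values (not simulation results from the review), how AIC and BIC can disagree on the number of components, because BIC's k log n penalty grows with the sample size:

```python
import numpy as np

def aic(loglik, k):
    """Akaike information criterion: -2 log L + 2k."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayesian information criterion: -2 log L + k log n."""
    return -2.0 * loglik + k * np.log(n)

# hypothetical fits: a third component buys only 4 extra nats of log-likelihood
n = 1000
ll = {1: -2500.0, 2: -2100.0, 3: -2096.0}     # illustrative values
k = {g: 3 * g - 1 for g in ll}                 # free parameters, 1-D Gaussian mixture

best_aic = min(ll, key=lambda g: aic(ll[g], k[g]))
best_bic = min(ll, key=lambda g: bic(ll[g], k[g], n))
```

Here AIC's flat penalty of 2 per parameter still rewards the marginal third component, while BIC's log(1000) ≈ 6.9 per parameter does not, a typical pattern in the simulation comparisons the review surveys.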
On the Accuracy of Stochastic Complexity Approximations
 IN A. GAMMERMAN (ED.), CAUSAL
, 1997
"... Stochastic complexity of a data set is defined as the shortest possible code length for the data obtainable by using some fixed set of models. This measure is of great theoretical and practical importance as a tool for tasks such as determining model complexity, or performing predictive inference. U ..."
Abstract

Cited by 3 (3 self)
 Add to MetaCart
Stochastic complexity of a data set is defined as the shortest possible code length for the data obtainable by using some fixed set of models. This measure is of great theoretical and practical importance as a tool for tasks such as determining model complexity, or performing predictive inference. Unfortunately, for cases where the data has missing information, computing the stochastic complexity requires marginalizing (integrating) over the missing data, which, even in the discrete-data case, amounts to computing a sum with an exponential number of terms. Therefore in most cases the stochastic complexity measure has to be approximated. In this paper we will investigate empirically the performance of some of the most common stochastic complexity approximations in an attempt to understand their small sample behavior in the incomplete data framework. In earlier empirical evaluations the problem of not knowing the actual stochastic complexity for incomplete data was circumvented either by us...
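The exponential-size marginalization the abstract refers to can be made concrete in a toy Beta-Bernoulli setting, where the complete-data code length is analytic and missing entries are summed out by brute force over all 2^m completions. The model and helper names are illustrative, not from the paper:

```python
import numpy as np
from math import lgamma
from itertools import product

def log_marg(heads, tails, a=1.0, b=1.0):
    """Log marginal likelihood of a complete binary sequence, Beta(a, b) prior."""
    return (lgamma(a + heads) + lgamma(b + tails) + lgamma(a + b)
            - lgamma(a) - lgamma(b) - lgamma(a + b + heads + tails))

def stochastic_complexity_missing(data):
    """Code length (nats) of partially observed binary data, computed exactly
    by summing the complete-data marginal over all 2^m completions of the
    m missing entries (None). Cost is exponential in m."""
    obs = [v for v in data if v is not None]
    m = sum(v is None for v in data)
    logs = []
    for fill in product([0, 1], repeat=m):
        xs = obs + list(fill)
        logs.append(log_marg(sum(xs), len(xs) - sum(xs)))
    logs = np.array(logs)
    log_p = logs.max() + np.log(np.exp(logs - logs.max()).sum())  # log-sum-exp
    return -log_p

data = [1, 0, 1, 1, None, None, 1]   # two missing entries -> 4 completions
sc = stochastic_complexity_missing(data)
```

With m missing entries the sum has 2^m terms, which is exactly why practical work must fall back on the approximations the paper benchmarks; the brute-force version above is only feasible for tiny m.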