Results 21–30 of 56
An empirical study of minimum description length model selection with infinite parametric complexity
 JOURNAL OF MATHEMATICAL PSYCHOLOGY
, 2006
Abstract

Cited by 10 (1 self)
Parametric complexity is a central concept in Minimum Description Length (MDL) model selection. In practice it often turns out to be infinite, even for quite simple models such as the Poisson and Geometric families. In such cases, MDL model selection based on NML and Bayesian inference based on Jeffreys' prior cannot be used. Several ways to resolve this problem have been proposed. We conduct experiments to compare and evaluate their behaviour on small sample sizes. We find surprisingly poor behaviour for the plug-in predictive code; a restricted NML model performs quite well, but it is questionable whether the results validate its theoretical motivation. A Bayesian marginal distribution with Jeffreys' prior can still be used if one sacrifices the first observation to make the posterior proper; this approach turns out to be the most dependable.
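One of the strategies this abstract compares, the plug-in predictive (prequential) code, can be made concrete with a minimal sketch. The function name and the `smoothing` start-up constant are our assumptions, not the paper's; the idea is only that each outcome is encoded with the model fitted to the outcomes before it.

```python
import math

def plugin_poisson_code_length(xs, smoothing=0.5):
    """Prequential plug-in code length (in nats) under the Poisson family:
    each outcome x_t is encoded with the Poisson distribution whose mean is
    estimated from the preceding outcomes. `smoothing` is a hypothetical
    start-up constant that keeps the estimated rate strictly positive."""
    total = 0.0
    for t, x in enumerate(xs):
        lam = (sum(xs[:t]) + smoothing) / (t + 1)
        # -log of the Poisson pmf: lam - x*log(lam) + log(x!)
        total += lam - x * math.log(lam) + math.lgamma(x + 1)
    return total
```

Summing the one-step-ahead -log-probabilities turns any predictive strategy into a code length in this way; the paper's experiments compare model selection based on such code lengths against NML-based and Bayesian alternatives.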
Laws and limits of econometrics
 ECONOMIC JOURNAL
, 2003
Abstract

Cited by 8 (3 self)
We start by discussing some general weaknesses and limitations of the econometric approach. A template from sociology is used to formulate six laws that characterize mainstream activities of econometrics and the scientific limits of those activities. Next, we discuss some proximity theorems that quantify by means of explicit bounds how close we can get to the generating mechanism of the data and the optimal forecasts of next period observations using a finite number of observations. The magnitude of the bound depends on the characteristics of the model and the trajectory of the observed data. The results show that trends are more elusive to model than stationary processes in the sense that the proximity bounds are larger. By contrast, the bounds are of smaller order for models that are unidentified or nearly unidentified, so that lack or near lack of identification may not be as fatal to the use of a model in practice as some recent results on inference suggest. Finally, we look at one possible future of econometrics that involves the use of advanced econometric methods interactively by way of a web browser. With these methods users may access a suite of econometric methods and data sets online. They may also upload data to remote servers and by simple web browser selections initiate the implementation of advanced econometric software algorithms, returning the results online and by file and graphics downloads.
Iterated Logarithmic Expansions of the Pathwise Code Lengths for Exponential Families
 IEEE Transactions on Information Theory
, 1999
Abstract

Cited by 7 (2 self)
Rissanen's Minimum Description Length (MDL) principle is a statistical modeling principle motivated by coding theory. For exponential families we obtain pathwise expansions, to constant order, of the predictive and mixture code lengths used in MDL. The results are useful for understanding the differences between MDL forms.
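The two code lengths named here can be illustrated for the Bernoulli family (our sketch; the function names are ours). With Jeffreys' prior, the mixture code length has a closed form via gamma functions, and the chain rule expresses exactly the same quantity as a sum of one-step predictive log-losses; the paper's expansions concern how the maximum-likelihood plug-in predictive code deviates from such mixture codes at the constant order.

```python
from math import lgamma, log

def kt_mixture_code_length(xs):
    """Code length (nats) of the Bayes mixture code for a binary sequence
    under the Bernoulli family with Jeffreys prior (the KT estimator):
    -log of the marginal probability, in closed form."""
    n, k = len(xs), sum(xs)
    return -(lgamma(k + 0.5) + lgamma(n - k + 0.5)
             - 2 * lgamma(0.5) - lgamma(n + 1.0))

def kt_predictive_code_length(xs):
    """The same quantity via the chain rule: a sum of one-step predictive
    log-losses, with p(next = 1) = (#ones so far + 1/2) / (t + 1)."""
    total, k = 0.0, 0
    for t, x in enumerate(xs):
        p1 = (k + 0.5) / (t + 1.0)
        total += -log(p1 if x == 1 else 1.0 - p1)
        k += x
    return total
```

For any binary sequence the two functions agree to machine precision, which is the chain-rule decomposition that makes mixture codes computable sequentially.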
Learning Bayesian belief networks with neural network estimators
 In Neural Information Processing Systems 9
, 1997
Abstract

Cited by 6 (2 self)
In this paper we propose a method for learning Bayesian belief networks from data. The method uses artificial neural networks as probability estimators, thus avoiding the need for prior assumptions about the nature of the probability distributions governing the relationships among the participating variables. The method can therefore be applied to domains containing both discrete and continuous variables with arbitrary distributions. We compare the learning performance of this new method with that of the method proposed by Cooper and Herskovits in [10]. The experimental results show that, although the learning scheme based on ANN estimators is slower, the learning accuracy of the two methods is comparable. (To appear in Advances in Neural Information Processing Systems, 1996.)
Asymptotic log-loss of prequential maximum likelihood codes
 In Conference on Learning Theory (COLT 2005)
, 2005
Abstract

Cited by 6 (4 self)
We analyze the Dawid-Rissanen prequential maximum likelihood codes relative to one-parameter exponential family models M. If data are i.i.d. according to an (essentially) arbitrary P, then the redundancy grows at rate (1/2) c ln n. We show that c = σ₁²/σ₂², where σ₁² is the variance of P, and σ₂² is the variance of the distribution M* ∈ M that is closest to P in KL divergence. This shows that prequential codes behave quite differently from other important universal codes such as the two-part MDL, Shtarkov and Bayes codes, for which c = 1. This behavior is undesirable in an MDL model selection setting.
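A worked instance of the rate, with hypothetical numbers of our choosing: take P to be Poisson with mean 4 (so σ₁² = 4) and the geometric family on {0, 1, 2, ...} as the model class. For a one-parameter exponential family the KL-closest element matches the mean of P, and a geometric distribution with mean μ has variance μ(1 + μ).

```python
def redundancy_coefficient(var_P, var_M_star):
    """c = sigma_1^2 / sigma_2^2: variance of the data-generating P divided
    by the variance of the KL-closest model element M*."""
    return var_P / var_M_star

# Hypothetical setup: data i.i.d. Poisson with mean 4 (variance 4); model
# class = geometric distributions. The KL projection matches the mean, and
# a geometric with mean mu has variance mu * (1 + mu).
mu = 4
c = redundancy_coefficient(4, mu * (1 + mu))   # = 4 / 20 = 0.2
# Redundancy then grows like (c/2) ln n = 0.1 ln n, versus the (1/2) ln n
# of the two-part, Shtarkov and Bayes codes, for which c = 1.
```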
From universal laws of cognition to specific cognitive models
 34 – 215535 Deliverable 1.1.1
, 2008
Abstract

Cited by 4 (0 self)
The remarkable successes of the physical sciences have been built on highly general quantitative laws, which serve as the basis for understanding an enormous variety of specific physical systems. How far is it possible to construct universal principles in the cognitive sciences, in terms of which specific aspects of perception, memory, or decision making might be modelled? Following Shepard (e.g., 1987), it is argued that some universal principles may be attainable in cognitive science. Here we propose two examples: the simplicity principle, which states that the cognitive system prefers patterns that provide simpler explanations of the available data; and the scale-invariance principle, which states that many cognitive phenomena are independent of the scale of relevant underlying physical variables, such as time, space, luminance, or sound pressure. We illustrate how these principles may be combined to explain specific cognitive processes by using them to derive SIMPLE, a formal model of memory for serial order (Brown, Neath & Chater, in press), and briefly mention extensions to models of identification and categorization. We also consider the scope and limitations of universal laws in cognitive science.
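The scale-invariance principle can be illustrated with a toy forgetting curve (our sketch, not the SIMPLE model itself): a power law in time makes every ratio of retention values independent of the unit in which time is measured.

```python
def retention(t, a=0.5):
    """Toy power-law forgetting curve t**(-a); the exponent is hypothetical."""
    return t ** (-a)

# Rescaling time by any factor k (say, switching from seconds to minutes)
# leaves every *ratio* of retention values unchanged, so the model's
# predictions do not depend on the chosen time scale.
k = 60.0
ratio_original = retention(2.0) / retention(5.0)
ratio_rescaled = retention(2.0 * k) / retention(5.0 * k)   # same ratio
```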
An empirical study of MDL model selection with infinite parametric complexity
 J. Mathematical Psychology
, 2006
Abstract

Cited by 4 (2 self)
Parametric complexity is a central concept in MDL model selection. In practice it often turns out to be infinite, even for quite simple models such as the Poisson and Geometric families. In such cases, MDL model selection based on NML and Bayesian inference based on Jeffreys' prior cannot be used. Several ways to resolve this problem have been proposed. We conduct experiments to compare and evaluate their behaviour on small sample sizes. We find surprisingly poor behaviour for the plug-in predictive code; a restricted NML model performs quite well, but it is questionable whether the results validate its theoretical motivation. The Bayesian model with the improper Jeffreys' prior is the most dependable.
Stochastic Complexity and Its Applications
 In Workshop on Model Uncertainty and Model Robustness. Online
Abstract

Cited by 3 (0 self)
Introduction. One can make a strong case that an unsurpassed model of any set of observations, generated by some physical machinery, is provided by the shortest program for a universal computer with which the data can be reproduced. Indeed, such a program must take advantage of all the constraints in the data, and hence it will capture the relevant properties of the machinery, provided of course that the data set is large enough to reflect them. Unfortunately, such a program, or even its length, the celebrated Kolmogorov complexity, cannot be found by algorithmic means, which has the devastating implication that, even though we can estimate it from above, we cannot assess the goodness of the estimate. This puts an end to any dream of basing inductive inference directly on the Kolmogorov complexity. The problem of non-computability can be overcome, while retaining the idea of measuring the strength of constraints by code length, if we select a smaller class of 'codes' as ...
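The "estimate from above" mentioned in the passage is easy to demonstrate (our sketch): any general-purpose compressor yields a computable upper bound on Kolmogorov complexity, since the compressed string plus a fixed-size decompressor is one program that reproduces the data. What the non-computability result forbids is certifying how tight that bound is.

```python
import zlib

def kolmogorov_upper_bound(data: bytes) -> int:
    """A computable upper bound (in bytes, up to an additive constant for
    the decompressor) on the Kolmogorov complexity of `data`: the length
    of a zlib-compressed copy. No algorithm can tell us how far this
    over-estimate sits from the true complexity."""
    return len(zlib.compress(data, 9))

# Highly constrained data compresses far below its raw length, reflecting
# the strong regularities a short program could exploit.
structured = b"ab" * 500   # 1000 bytes with an obvious constraint
bound = kolmogorov_upper_bound(structured)
```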
Statistical Computing
 ACM Transaction
, 2003
Abstract

Cited by 3 (0 self)
In this article, description length is interchangeable with code length, so the question really is: why code length in statistics? To answer that, we first need to answer: what is a code? Given a finite ...
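Since the excerpt breaks off at the definition, a minimal illustration of what "a code" means here (the example codewords are ours): a mapping from symbols of a finite alphabet to binary strings, here prefix-free so that concatenated codewords decode unambiguously, with codeword lengths governed by the Kraft inequality.

```python
# A prefix-free code over a finite alphabet: no codeword is a prefix of
# another, so a concatenation of codewords decodes unambiguously.
code = {"a": "0", "b": "10", "c": "110", "d": "111"}   # hypothetical example

def kraft_sum(code):
    """Kraft inequality check: the sum of 2**-len(w) over all codewords is
    at most 1 for every prefix code; equality means no codeword length
    could be shortened without breaking the prefix property."""
    return sum(2.0 ** -len(w) for w in code.values())
```

This is the bridge to statistics: a codeword length of L bits corresponds to a probability of 2**-L, which is why minimizing description length and maximizing likelihood are two views of the same quantity.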
Some Bayesian perspectives on statistical modelling
, 1988
Abstract

Cited by 3 (2 self)
I would like to thank my supervisor, Professor A. F. M. Smith, for all his advice and encouragement.