Results 1 - 3 of 3
An empirical study of minimum description length model selection with infinite parametric complexity
Journal of Mathematical Psychology, 2006
Abstract

Cited by 10 (1 self)
Parametric complexity is a central concept in Minimum Description Length (MDL) model selection. In practice it often turns out to be infinite, even for quite simple models such as the Poisson and Geometric families. In such cases, MDL model selection based on NML and Bayesian inference based on Jeffreys' prior cannot be used. Several ways to resolve this problem have been proposed. We conduct experiments to compare and evaluate their behaviour on small sample sizes. We find surprisingly poor behaviour for the plug-in predictive code; a restricted NML model performs quite well, but it is questionable whether the results validate its theoretical motivation. A Bayesian marginal distribution with Jeffreys' prior can still be used if one sacrifices the first observation to make the posterior proper; this approach turns out to be the most dependable.
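The "sacrifice the first observation" fix can be sketched for the Poisson family: Jeffreys' prior π(λ) ∝ λ^(-1/2) is improper, but the posterior after a single observation x1 is a proper Gamma(x1 + 1/2, rate 1), so the predictive distribution for the remaining data is well defined. A minimal sketch of this idea (function names are mine; an illustration, not the paper's experimental code):

```python
import math

def jeffreys_predictive_pmf(x1, x2):
    """Predictive P(x2 | x1) for the Poisson family under Jeffreys' prior.

    Jeffreys' prior pi(lambda) ∝ lambda^(-1/2) is improper, but after
    observing x1 the posterior is a proper Gamma(x1 + 1/2, rate=1).
    Integrating lambda out gives a negative binomial predictive:
    P(x2 | x1) = C(x2 + a - 1, x2) * (1/2)^(a + x2) with a = x1 + 1/2.
    """
    a = x1 + 0.5
    log_coef = math.lgamma(x2 + a) - math.lgamma(a) - math.lgamma(x2 + 1)
    return math.exp(log_coef - (a + x2) * math.log(2.0))

# Unlike the unconditioned Jeffreys "marginal", the conditional
# distribution is proper: its probabilities sum to 1.
total = sum(jeffreys_predictive_pmf(3, k) for k in range(200))
```

The point of the sketch is that conditioning on one observation turns an improper prior into a proper predictive code, which is exactly what makes the Bayesian approach usable again.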
Classifier Learning with Supervised Marginal Likelihood
Abstract

Cited by 9 (4 self)
It has been argued that in supervised classification tasks it may be more sensible to perform model selection with respect to a more focused model selection score, like the supervised (conditional) marginal likelihood, than with respect to the standard unsupervised marginal likelihood criterion. However, for most Bayesian network models, computing the supervised marginal likelihood score takes exponential time with respect to the amount of observed data. In this paper, we consider diagnostic Bayesian network classifiers where the significant model parameters represent conditional distributions for the class variable, given the values of the predictor variables, in which case the supervised marginal likelihood can be computed in linear time with respect to the data. As the number of model parameters in this case grows exponentially with the number of predictors, we focus on simple diagnostic models where the number of relevant predictors is small, and suggest two approaches for applying models of this type in classification. The first approach is based on mixtures of simple diagnostic models, while in the second approach we apply the small predictor sets of the simple diagnostic models to augment the Naive Bayes classifier.
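The linear-time computation the abstract alludes to can be sketched under a common modelling assumption: if the parameters P(class | predictor configuration) get independent symmetric Dirichlet priors, the supervised marginal likelihood factorizes over configurations and equals a product of sequential predictive probabilities, computable in one pass over the data. A hypothetical sketch (names and the Dirichlet(α) choice are mine, not necessarily the paper's exact model):

```python
import math
from collections import defaultdict

def supervised_marginal_loglik(data, num_classes, alpha=1.0):
    """Log supervised marginal likelihood, in one pass over the data.

    data: list of (predictor_config_tuple, class_label) pairs.
    Each predictor configuration gets an independent symmetric
    Dirichlet(alpha) prior over class probabilities, so the marginal
    likelihood is a product of Dirichlet-multinomial predictive
    probabilities -- linear time in the number of observations.
    """
    counts = defaultdict(lambda: [0] * num_classes)
    loglik = 0.0
    for x, c in data:
        n = counts[x]
        total = sum(n)
        # Sequential (prequential) predictive probability of the class.
        loglik += math.log((n[c] + alpha) / (total + num_classes * alpha))
        n[c] += 1
    return loglik
```

The exchangeability of the Dirichlet-multinomial makes the sequential product order-independent, so this single pass gives the exact marginal, not an approximation.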
de Rooij. Asymptotic log-loss of prequential maximum likelihood codes
In Conference on Learning Theory (COLT 2005), 2005
Abstract

Cited by 6 (4 self)
We analyze the Dawid-Rissanen prequential maximum likelihood codes relative to one-parameter exponential family models M. If data are i.i.d. according to an (essentially) arbitrary P, then the redundancy grows at rate (c/2) ln n. We show that c = σ₁²/σ₂², where σ₁² is the variance of P and σ₂² is the variance of the distribution M* ∈ M that is closest to P in KL divergence. This shows that prequential codes behave quite differently from other important universal codes such as the two-part MDL, Shtarkov and Bayes codes, for which c = 1. This behavior is undesirable in an MDL model selection setting.
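The constant c = σ₁²/σ₂² can be illustrated by simulation: take the Poisson family as M and draw data from a geometric distribution P with mean μ, so σ₁² = μ(1+μ) while the KL-closest Poisson has mean μ and σ₂² = μ, giving c = 1 + μ. A rough sketch (the smoothed plug-in startup below is my own illustrative choice, not the paper's exact protocol):

```python
import math
import random

def poisson_logpmf(k, lam):
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def prequential_redundancy(data, mu_star):
    """Cumulative log-loss of the plug-in code minus that of Poisson(mu_star)."""
    total, red = 0, 0.0
    for i, x in enumerate(data):
        # Smoothed ML plug-in estimate from past data (avoids lam_hat = 0).
        lam_hat = (total + 1.0) / (i + 1.0)
        red += poisson_logpmf(x, mu_star) - poisson_logpmf(x, lam_hat)
        total += x
    return red

random.seed(0)
mu = 2.0                  # mean of the geometric source P; variance mu*(1+mu) = 6
c = mu * (1 + mu) / mu    # sigma1^2 / sigma2^2 = 1 + mu = 3
p = 1.0 / (1.0 + mu)      # geometric on {0, 1, ...} with mean mu
n, trials = 1000, 200
avg_red = sum(
    prequential_redundancy(
        [int(math.log(1.0 - random.random()) / math.log(1.0 - p)) for _ in range(n)],
        mu,
    )
    for _ in range(trials)
) / trials
# Theory predicts avg_red ≈ (c/2) ln n ≈ 10.4 nats for c = 3, n = 1000,
# well above the (1/2) ln n ≈ 3.5 nats a Bayes or NML code would incur.
```

Because the data are not Poisson, the plug-in code pays roughly c times the usual (1/2) ln n redundancy, which is the mismatch the abstract describes.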