Discussion on Kolmogorov Complexity and Statistical Analysis
, 1999
Abstract

Cited by 12 (0 self)
equality (1) could be explained as follows: any object x ∈ A has a two-part description. The first part is (a description of a) program p. The second part is the number of x in the enumeration of A (the element that appears first has number 1, the next element has number 2, etc.). The first part requires K(p) bits. The second part requires at most log₂ |A| bits. (Additional O(log n) bits are needed to form a pair; we omit the details.) We are interested in ‘efficient’ two-part descriptions for which the inequality (1) is close to equality. For any string x there are many efficient descriptions. Here are two ‘extreme’ examples: (a) The set A consists of x only: A = {x}; the program p that enumerates …
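The trade-off in the quoted passage can be made concrete with a toy calculation. In the sketch below we simply plug in a bit count for the program length, since the true Kolmogorov complexity K(p) is uncomputable; the chosen numbers are our illustration, not from the paper:

```python
import math

def two_part_length(program_bits: int, set_size: int) -> int:
    """Total two-part code length: program_bits for the enumerating
    program p, plus ceil(log2 |A|) bits for the index of x within A."""
    return program_bits + math.ceil(math.log2(set_size))

# Extreme (a): A = {x} for an n-bit string x. The "program" must spell
# out x itself (~n bits), and the index part costs log2(1) = 0 bits.
n = 1000
singleton = two_part_length(program_bits=n, set_size=1)

# Extreme (b): A = all n-bit strings. A tiny program enumerates A,
# and the index costs the full n bits.
generic = two_part_length(program_bits=20, set_size=2 ** n)

print(singleton, generic)  # both totals stay close to n = 1000 bits
```

Both extremes give totals near n, which is why the interesting question is which efficient descriptions lie between them.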
Model Selection by Normalized Maximum Likelihood
, 2005
Abstract

Cited by 12 (3 self)
The Minimum Description Length (MDL) principle is an information-theoretic approach to inductive inference that originated in algorithmic coding theory. In this approach, data are viewed as codes to be compressed by the model. From this perspective, models are compared on their ability to compress a data set by extracting the useful information in the data apart from random noise. The goal of model selection is to identify the model, from a set of candidate models, that permits the shortest description length (code) of the data. Since Rissanen originally formalized the problem using the crude ‘two-part code’ MDL method in the 1970s, many significant strides have been made, especially in the 1990s, culminating in the development of the refined ‘universal code’ MDL method, dubbed Normalized Maximum Likelihood (NML). It represents an elegant solution to the model selection problem. The present paper provides a tutorial review of these latest developments with a special focus on NML. An application example of NML in cognitive modeling is also provided.
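For a concrete instance of the NML idea, the sketch below computes the NML code length for the Bernoulli model family, where the normalizer (the parametric complexity) sums the maximized likelihood over every possible data set. This is a standard textbook example of NML, not an excerpt from the paper:

```python
import math

def bernoulli_ml(k: int, n: int) -> float:
    """Maximized Bernoulli likelihood of a sequence with k heads
    in n flips, i.e. P(data | theta_hat) with theta_hat = k / n."""
    if k == 0 or k == n:
        return 1.0
    p = k / n
    return p ** k * (1 - p) ** (n - k)

def nml_code_length(k: int, n: int) -> float:
    """NML code length in bits: -log2 of the maximized likelihood
    plus log2 of the normalizer summed over all n-flip data sets.
    The second term is the model class's parametric complexity."""
    norm = sum(math.comb(n, j) * bernoulli_ml(j, n) for j in range(n + 1))
    return -math.log2(bernoulli_ml(k, n)) + math.log2(norm)
```

A balanced sequence (k = 5 of n = 10) costs more bits than an all-heads one, because its maximized likelihood is smaller while the complexity term is shared by all sequences.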
Minimum Message Length Autoregressive Model Order Selection
 International Conference on Intelligent Sensing and Information Processing (ICISIP)
, 2004
Abstract

Cited by 10 (9 self)
We derive a Minimum Message Length (MML) estimator for stationary and non-stationary autoregressive models using the Wallace and Freeman (1987) approximation. The MML estimator’s model selection performance is empirically compared with AIC, AICc, BIC and HQ in a Monte Carlo experiment by uniformly sampling from the autoregressive stationarity region. Generally applicable, uniform priors are used on the coefficients, model order and log σ² for the MML estimator. The experimental results show the MML estimator to have the best overall average mean squared prediction error and best ability to choose the true model order.
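The MML estimator's derivation is not reproduced here, but the competing criteria it is benchmarked against are easy to sketch. The following is a minimal AR order selector by AIC or BIC using least-squares fits; the function and variable names are ours, not the paper's:

```python
import numpy as np

def ar_order_by_criterion(x, p_max=6, criterion="BIC"):
    """Choose an AR model order by an information criterion
    (two of the benchmarks in the paper's Monte Carlo experiment).
    Fits each AR(p) by least squares on the lagged design matrix."""
    x = np.asarray(x, dtype=float)
    best_order, best_score = None, np.inf
    for p in range(1, p_max + 1):
        # Regress x[t] on x[t-1], ..., x[t-p]: column j holds lag j+1.
        X = np.column_stack(
            [x[p - j - 1 : len(x) - j - 1] for j in range(p)]
        )
        y = x[p:]
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ coef) ** 2))
        n = len(y)
        penalty = 2 * p if criterion == "AIC" else p * np.log(n)
        score = n * np.log(rss / n) + penalty
        if score < best_score:
            best_order, best_score = p, score
    return best_order
```

On data simulated from a clear AR(2) process, BIC recovers the true order; AIC's lighter penalty makes it more prone to overselection, which is part of what the paper's experiment measures.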
Univariate Polynomial Inference by Monte Carlo Message Length Approximation
 in Int. Conf. Machine Learning
, 2002
Abstract

Cited by 9 (5 self)
We apply the Message from Monte Carlo (MMC) algorithm to inference of univariate polynomials. MMC is an algorithm for point estimation from a Bayesian posterior sample.
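The MMC algorithm works from a Bayesian posterior sample and is not reproduced here. As a rough stand-in, the sketch below chooses a polynomial degree by a crude two-part message length (a Gaussian residual cost plus a BIC-style parameter cost), which is the kind of fit-versus-complexity trade-off the point estimate targets:

```python
import numpy as np

def select_poly_degree(x, y, max_degree=8):
    """Pick a univariate polynomial degree by a crude two-part message
    length: data cost (negative Gaussian log-likelihood of residuals)
    plus a BIC-style cost per coefficient, all in nats. Illustrative
    stand-in only; the paper's MMC method works from a posterior sample."""
    n = len(y)
    best_degree, best_length = 0, np.inf
    for d in range(max_degree + 1):
        coef = np.polyfit(x, y, d)
        rss = float(np.sum((y - np.polyval(coef, x)) ** 2))
        msg_len = 0.5 * n * np.log(rss / n) + 0.5 * (d + 1) * np.log(n)
        if msg_len < best_length:
            best_degree, best_length = d, msg_len
    return best_degree
```

On data generated from a cubic with small noise, the parameter cost prevents the higher-degree fits from winning on residual error alone.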
Advances on BYY Harmony Learning: Information Theoretic Perspective, Generalized Projection Geometry, and Independent Factor Autodetermination
, 2004
Abstract

Cited by 9 (7 self)
The nature of Bayesian Ying-Yang harmony learning is re-examined from an information-theoretic perspective. Not only is its ability for model selection and regularization explained with new insights, but its relations to and differences from the studies of minimum description length (MDL), the Bayesian approach, bits-back based MDL, the Akaike information criterion (AIC), maximum likelihood, information geometry, Helmholtz machines, and variational approximation are also discussed. Moreover, a generalized projection geometry is introduced for further understanding of this new mechanism. Furthermore, new algorithms are developed for implementing Gaussian factor analysis (FA) and non-Gaussian factor analysis (NFA) such that appropriate factors are selected automatically during parameter learning.
MML Inference of Oblique Decision Trees
 In Lecture Notes in Artificial Intelligence (LNAI) 3339 (Springer), Proc. 17th Australian Joint Conf. on AI
, 2004
Abstract

Cited by 9 (5 self)
Abstract. We propose a multivariate decision tree inference scheme using the minimum message length (MML) principle (Wallace and Boulton, 1968; Wallace and Dowe, 1999). The scheme uses MML coding as an objective (goodness-of-fit) function for model selection and searches with a simple evolution strategy. We test our multivariate tree inference scheme on UCI machine learning repository data sets and compare it with the decision tree programs C4.5 and C5. The preliminary results show that on average, and on most data sets, MML oblique trees clearly perform better than both C4.5 and C5 on both “right”/“wrong” accuracy and probabilistic prediction, and with smaller trees, i.e., fewer leaf nodes.
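To illustrate the two ingredients named in the abstract, the sketch below scores a single oblique split with a simplified two-part cost (label-entropy data cost plus a per-coefficient model charge) and searches for the hyperplane with a (1+1) evolution strategy. This is our simplified illustration, not the paper's exact coding scheme:

```python
import numpy as np

def split_cost(w, X, y):
    """Simplified two-part cost of the oblique split w.x >= 0: empirical
    label entropy on each side (data cost, in nats) plus a BIC-style
    charge per hyperplane coefficient (model cost). Illustrative only."""
    side = X @ w >= 0
    cost = 0.5 * len(w) * np.log(len(y))  # model cost
    for mask in (side, ~side):
        labels = y[mask]
        n = len(labels)
        if n == 0:
            continue
        for c in np.unique(labels):
            k = int(np.sum(labels == c))
            cost -= k * np.log(k / n)  # data cost for this class
    return cost

def evolve_split(X, y, steps=300, seed=0):
    """(1+1) evolution strategy: mutate the hyperplane normal with
    Gaussian noise and keep any mutation that lowers the cost."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    best = split_cost(w, X, y)
    for _ in range(steps):
        candidate = w + 0.3 * rng.normal(size=w.shape)
        c = split_cost(candidate, X, y)
        if c < best:
            w, best = candidate, c
    return w, best
```

On two well-separated Gaussian blobs the evolved hyperplane drives the data cost toward zero, leaving only the model cost.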
Causal models as minimal descriptions of multivariate systems. http://parallel.vub.ac.be/∼jan
, 2006
Abstract

Cited by 8 (0 self)
ABSTRACT. By applying the minimality principle for model selection, one should seek the model that describes the data by a code of minimal length. Learning is viewed as data compression that exploits the regularities or qualitative properties found in the data in order to build a model containing the meaningful information. The theory of causal modeling can be interpreted through this approach. The regularities are the conditional independencies that reduce a factorization and the v-structure regularities. In the absence of other regularities, a causal model is faithful and offers a minimal description of a probability distribution. The causal interpretation of a faithful Bayesian network is motivated by the canonical representation it offers and by faithfulness. A causal model decomposes the distribution into independent atomic blocks and is able to explain all qualitative properties found in the data. The existence of faithful models depends on the additional regularities in the data. Local structure in the conditional probability distributions allows further compression of the model. Interfering regularities, however, generate conditional independencies that do not follow from the Markov condition. These regularities have to be incorporated into an augmented model for which the inference algorithms are adapted to take their influences into account. For other regularities, however, such as patterns in a string, causality does not offer a modeling framework that leads to a minimal description.
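The claim that conditional independencies reduce a factorization can be checked by parameter counting. The toy sketch below (our illustration, not from the paper) counts the free parameters of a Bayesian-network factorization versus the unconstrained joint:

```python
from math import prod

def free_parameters(parents, card):
    """Free parameters of a Bayesian-network factorization: each
    variable v needs (card[v] - 1) probabilities for every joint
    configuration of its parents. `parents` maps each variable to
    its parent list; `card` gives each variable's cardinality."""
    return sum(
        (card[v] - 1) * prod(card[p] for p in parents[v])
        for v in parents
    )

# Chain A -> B -> C over binary variables: P(A)P(B|A)P(C|B) needs
# 1 + 2 + 2 = 5 parameters, versus 2**3 - 1 = 7 for the full joint.
card = {"A": 2, "B": 2, "C": 2}
chain = {"A": [], "B": ["A"], "C": ["B"]}
print(free_parameters(chain, card))
```

The gap between the two counts is exactly the compression that the conditional independence of C from A given B buys.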
Temporal BYY Encoding, Markovian State Spaces, and Space Dimension Determination
, 2004
Abstract

Cited by 7 (7 self)
As a complement to the temporal coding approaches in the current mainstream, this paper addresses Markovian state-space temporal models from the perspective of temporal Bayesian Ying-Yang (BYY) learning, with new insights and new results on not only the discrete-state Hidden Markov model and its extensions but also the continuous-state linear state-space models and their extensions. In particular, a new learning mechanism selects the number of states or the dimension of the state space either automatically during adaptive learning or afterwards via model selection criteria obtained from this mechanism. Experiments demonstrate how the proposed approach works.
INFORMATION MEASURE FOR MODULARITY IN ENGINEERING DESIGN
, 2004
Abstract

Cited by 7 (2 self)
Modular structures are common in complex natural and artificial systems, and the terms “modular” or “modularity” are used throughout the engineering design literature. However, formal ways to measure or quantify modularity are still needed. This paper introduces an information-based approach to measure modularity, built on the relationship between complexity and modularity. In this information-based measure, a modular structure is encoded as a message describing information contained in the modular structure; the shorter the message, the higher the modularity of the structure. The information measure is dependent on the modeling and representation of the system. Following this basic idea, an approximate expression for the information measure of abstract graph structures is introduced. Since function structures in engineering design are typically represented as abstract graphs, this approach can be used to synthesize favorable modularity in parallel with the design of new systems. Using a genetic algorithm approach, with the reciprocal of the approximate measure as the fitness function, modular configurations are found in abstract graphs.
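As one way to make the "shorter message means more modular" idea concrete, the sketch below encodes a graph relative to a candidate module partition, charging log₂ of a binomial coefficient per block pair. This encoding is our illustration of the principle, not the paper's exact measure:

```python
import math
from itertools import combinations, combinations_with_replacement

def partition_description_bits(edges, modules):
    """Bits to describe a graph's edges given a module assignment
    (modules[v] is vertex v's module): for each pair of blocks,
    transmit the edge count, then which of the possible slots carry
    an edge (log2 of a binomial coefficient). A partition matching
    the modular structure yields a shorter message."""
    groups = {}
    for v, m in enumerate(modules):
        groups.setdefault(m, []).append(v)
    edge_set = {frozenset(e) for e in edges}
    bits = 0.0
    for a, b in combinations_with_replacement(sorted(groups), 2):
        va, vb = groups[a], groups[b]
        if a == b:
            slots = len(va) * (len(va) - 1) // 2
            actual = sum(1 for u, w in combinations(va, 2)
                         if frozenset((u, w)) in edge_set)
        else:
            slots = len(va) * len(vb)
            actual = sum(1 for u in va for w in vb
                         if frozenset((u, w)) in edge_set)
        if slots:
            bits += math.log2(slots + 1)                 # edge count
            bits += math.log2(math.comb(slots, actual))  # which slots
    return bits
```

For two disjoint triangles, splitting them into two modules gives a shorter description than lumping all six vertices together, matching the paper's premise; a genetic algorithm could then search partitions with the reciprocal of this length as fitness.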