Universal Discrete Denoising: Known Channel
IEEE Trans. Inform. Theory, 2003
Cited by 79 (32 self)
A discrete denoising algorithm estimates the input sequence to a discrete memoryless channel (DMC) based on the observation of the entire output sequence. For the case in which the DMC is known and the quality of the reconstruction is evaluated with a given single-letter fidelity criterion, we propose a discrete denoising algorithm that does not assume knowledge of statistical properties of the input sequence. Yet, the algorithm is universal in the sense of asymptotically performing as well as the optimum denoiser that knows the input sequence distribution, which is only assumed to be stationary and ergodic. Moreover, the algorithm is also universal in a semi-stochastic setting, in which the input is an individual sequence and the randomness is due solely to the channel noise.
A tutorial introduction to the minimum description length principle
in Advances in Minimum Description Length: Theory and Applications, 2005
Model Selection by Normalized Maximum Likelihood
2005
Cited by 12 (3 self)
The Minimum Description Length (MDL) principle is an information-theoretic approach to inductive inference that originated in algorithmic coding theory. In this approach, data are viewed as codes to be compressed by the model. From this perspective, models are compared on their ability to compress a data set by extracting useful information in the data apart from random noise. The goal of model selection is to identify the model, from a set of candidate models, that permits the shortest description length (code) of the data. Since Rissanen originally formalized the problem using the crude ‘two-part code’ MDL method in the 1970s, many significant strides have been made, especially in the 1990s, with the culmination of the development of the refined ‘universal code’ MDL method, dubbed Normalized Maximum Likelihood (NML). It represents an elegant solution to the model selection problem. The present paper provides a tutorial review on these latest developments with a special focus on NML. An application example of NML in cognitive modeling is also provided.
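To make the NML idea above concrete, here is a minimal sketch for the simplest possible case, the Bernoulli model on binary sequences. The choice of model and the function names (`bernoulli_max_likelihood`, `bernoulli_normalizer`, `nml_codelength_bits`) are assumptions for illustration and are not taken from the paper. The NML code length of a sequence is the negative log of its maximized likelihood plus the log of a normalizer that sums the maximized likelihood over all sequences of the same length.

```python
import math


def bernoulli_max_likelihood(k: int, n: int) -> float:
    """Likelihood of one binary sequence with k ones out of n, at the ML estimate k/n."""
    if n == 0:
        return 1.0
    p = k / n
    # Python evaluates 0.0 ** 0 as 1.0, matching the usual 0^0 = 1 convention here.
    return (p ** k) * ((1 - p) ** (n - k))


def bernoulli_normalizer(n: int) -> float:
    """Sum of maximized likelihoods over all 2^n sequences, grouped by the count of ones."""
    return sum(
        math.comb(n, k) * bernoulli_max_likelihood(k, n) for k in range(n + 1)
    )


def nml_codelength_bits(sequence) -> float:
    """NML code length in bits: -log2(max likelihood) + log2(normalizer)."""
    n = len(sequence)
    k = sum(sequence)
    return -math.log2(bernoulli_max_likelihood(k, n)) + math.log2(
        bernoulli_normalizer(n)
    )
```

The logarithm of the normalizer is the quantity usually called parametric complexity; in this sketch it depends only on the sequence length, not on the data.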
An empirical study of minimum description length model selection with infinite parametric complexity
Journal of Mathematical Psychology, 2006
Cited by 10 (1 self)
Parametric complexity is a central concept in Minimum Description Length (MDL) model selection. In practice it often turns out to be infinite, even for quite simple models such as the Poisson and Geometric families. In such cases, MDL model selection as based on NML and Bayesian inference based on Jeffreys’ prior cannot be used. Several ways to resolve this problem have been proposed. We conduct experiments to compare and evaluate their behaviour on small sample sizes. We find interestingly poor behaviour for the plug-in predictive code; a restricted NML model performs quite well but it is questionable if the results validate its theoretical motivation. A Bayesian marginal distribution with Jeffreys’ prior can still be used if one sacrifices the first observation to make a proper posterior; this approach turns out to be most dependable.
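The following sketch illustrates why the parametric complexity is infinite for the Poisson family, under the simplifying assumption of a single observation (sample size 1); the function names are hypothetical and the experiment is not one from the paper. With one observation x, the maximum likelihood estimate of the rate is x itself, so the NML normalizer is the sum over x of e^{-x} x^x / x!. By Stirling's approximation each term behaves like 1/sqrt(2πx), so the partial sums keep growing, roughly like the square root of the cutoff.

```python
import math


def poisson_max_likelihood(x: int) -> float:
    """Poisson probability of x evaluated at the ML rate estimate (rate = x)."""
    if x == 0:
        return 1.0  # rate 0 puts all probability mass on x = 0
    # Work in log space so that x^x and x! do not overflow for large x.
    return math.exp(x * math.log(x) - x - math.lgamma(x + 1))


def partial_normalizer(upper: int) -> float:
    """Partial sum of the NML normalizer over outcomes 0..upper (inclusive)."""
    return sum(poisson_max_likelihood(x) for x in range(upper + 1))


if __name__ == "__main__":
    for upper in (10, 100, 1_000, 10_000):
        print(upper, partial_normalizer(upper))
```

Restricting the range of outcomes or parameters, as in the restricted NML variant mentioned in the abstract, is one way to make this sum finite.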
The empirical distribution of rate-constrained source codes
IEEE Trans. Inform. Theory
Cited by 7 (2 self)
Let $X = (X_1, \ldots)$ be a stationary ergodic finite-alphabet source, $X^n$ denote its first $n$ symbols, and $Y^n$ be the codeword assigned to $X^n$ by a lossy source code. The empirical $k$th-order joint distribution $\hat{Q}^k[X^n, Y^n](x^k, y^k)$ is defined as the frequency of appearances of pairs of $k$-strings $(x^k, y^k)$ along the pair $(X^n, Y^n)$. Our main interest is in the sample behavior of this (random) distribution. Letting $I(Q^k)$ denote the mutual information $I(X^k; Y^k)$ when $(X^k, Y^k) \sim Q^k$, we show that for any (sequence of) lossy source code(s) of rate $\le R$, $\limsup_{n \to \infty} \frac{1}{k} I(\hat{Q}^k[X^n, Y^n])$ ...
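As a rough illustration of the definitions above, the sketch below computes an empirical kth-order joint distribution with a sliding window and then the mutual information of that distribution. The windowing and normalization conventions (non-cyclic windows, division by n - k + 1) and the function names are assumptions; the paper's exact definition may differ.

```python
import math
from collections import Counter, defaultdict


def empirical_joint(x, y, k):
    """Relative frequency of aligned k-string pairs (x-window, y-window) along (x, y)."""
    if len(x) != len(y) or not 1 <= k <= len(x):
        raise ValueError("x and y must have equal length n, with 1 <= k <= n")
    windows = len(x) - k + 1  # assumed convention: non-cyclic sliding windows
    counts = Counter(
        (tuple(x[i:i + k]), tuple(y[i:i + k])) for i in range(windows)
    )
    return {pair: count / windows for pair, count in counts.items()}


def mutual_information_bits(joint):
    """Mutual information, in bits, between the two components of a joint distribution."""
    px = defaultdict(float)
    py = defaultdict(float)
    for (xk, yk), p in joint.items():
        px[xk] += p
        py[yk] += p
    return sum(
        p * math.log2(p / (px[xk] * py[yk])) for (xk, yk), p in joint.items()
    )
```

Dividing `mutual_information_bits(empirical_joint(x, y, k))` by `k` gives the per-symbol quantity that appears in the limit statement of the abstract.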
Bayesian Network Structure Learning using Factorized NML Universal Models
2008
Cited by 7 (4 self)
Universal codes/models can be used for data compression and model selection by the minimum description length (MDL) principle. For many interesting model classes, such as Bayesian networks, the minimax regret optimal normalized maximum likelihood (NML) universal model is computationally very demanding. We suggest a computationally feasible alternative to NML for Bayesian networks, the factorized NML universal model, where the normalization is done locally for each variable. This can be seen as an approximate sum-product algorithm. We show that this new universal model performs extremely well in model selection, compared to the existing state of the art, even for small sample sizes.
On sequentially normalized maximum likelihood models
in: Workshop on Information Theoretic Methods in Science and Engineering (WITMSE08), 2008
Cited by 6 (3 self)
The important normalized maximum likelihood (NML) distribution is obtained via a normalization over all sequences of given length. It has two shortcomings: the resulting model is usually not a random process, and in
MDL denoising revisited
IEEE Transactions on Signal Processing, 57(9):3347-3360, 2009
Cited by 5 (2 self)
We refine and extend an earlier MDL denoising criterion for wavelet-based denoising. We start by showing that the denoising problem can be reformulated as a clustering problem, where the goal is to obtain separate clusters for informative and non-informative wavelet coefficients, respectively. This suggests two refinements: adding a code length for the model index, and extending the model in order to account for subband-dependent coefficient distributions. A third refinement is the derivation of soft thresholding inspired by predictive universal coding with weighted mixtures. We propose a practical method incorporating all three refinements, which is shown to achieve good performance and robustness in denoising both artificial and natural signals. Index Terms: Minimum description length (MDL) principle, wavelets, denoising.
An empirical study of MDL model selection with infinite parametric complexity
J. Mathematical Psychology, 2006
Cited by 4 (2 self)
Parametric complexity is a central concept in MDL model selection. In practice it often turns out to be infinite, even for quite simple models such as the Poisson and Geometric families. In such cases, MDL model selection as based on NML and Bayesian inference based on Jeffreys’ prior cannot be used. Several ways to resolve this problem have been proposed. We conduct experiments to compare and evaluate their behaviour on small sample sizes. We find interestingly poor behaviour for the plug-in predictive code; a restricted NML model performs quite well but it is questionable if the results validate its theoretical motivation. The Bayesian model with the improper Jeffreys’ prior is the most dependable.