Results 11–20 of 41
Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates
, 1005
Abstract

Cited by 8 (4 self)
The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is both structurally consistent and risk consistent, and the error probability of structure learning decays faster than any polynomial in the number of samples under fixed model size. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n, d, k) are given for the algorithm to satisfy structural and risk consistencies. In addition, the extremal structures for learning are identified; we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn using the proposed algorithm in terms of error rates for structure learning.
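The algorithm summarized here, a Chow-Liu tree pruned by thresholding, can be sketched in a few lines: estimate pairwise mutual informations, run a maximum-weight spanning-tree construction, and drop edges whose weight falls below a threshold. A minimal sketch assuming binary variables and a fixed threshold (the paper's threshold is adaptive, and all function names below are mine):

```python
import numpy as np
from itertools import combinations

def empirical_mutual_info(x, y, levels=2):
    """Plug-in estimate of the mutual information (in nats) between
    two discrete columns taking values in {0, ..., levels-1}."""
    joint = np.zeros((levels, levels))
    for a, b in zip(x, y):
        joint[a, b] += 1
    joint /= len(x)
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    mi = 0.0
    for i in range(levels):
        for j in range(levels):
            if joint[i, j] > 0:
                mi += joint[i, j] * np.log(joint[i, j] / (px[i] * py[j]))
    return mi

def chow_liu_forest(data, threshold):
    """Kruskal-style maximum-weight spanning construction on empirical
    mutual informations; edges at or below `threshold` are pruned,
    which is what turns the Chow-Liu tree into a forest."""
    n, d = data.shape
    edges = sorted(((empirical_mutual_info(data[:, i], data[:, j]), i, j)
                    for i, j in combinations(range(d), 2)), reverse=True)
    parent = list(range(d))
    def find(u):  # union-find with path halving
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    forest = []
    for w, i, j in edges:
        if w <= threshold:
            break  # remaining edges are weaker still: prune them all
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            forest.append((i, j))
    return forest
```

On data where only two of three binary variables are dependent, the forest keeps the dependent pair and leaves the third variable isolated, which is the structural-consistency behavior the abstract describes.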
Estimating and testing the order of a model
, 2002
Abstract

Cited by 7 (1 self)
This paper deals with order identification for nested models in the i.i.d. framework. We study the asymptotic efficiency of two generalized likelihood ratio tests of the order. They are based on two estimators which are proved to be strongly consistent. A version of Stein’s lemma yields an optimal underestimation error exponent. The lemma also implies that the overestimation error exponent is necessarily trivial. Our tests admit nontrivial underestimation error exponents. The optimal underestimation error exponent is achieved in some situations. The overestimation error can decay exponentially with respect to a positive power of the number of observations. These results are proved under mild assumptions by relating the underestimation (resp. overestimation) error to large (resp. moderate) deviations of the log-likelihood process. In particular, it is not necessary that the classical Cramér condition be satisfied; namely, the log-densities are not required to admit every exponential moment. Three benchmark examples with specific difficulties (location mixture of normal distributions, abrupt changes and various regressions) are detailed so as to illustrate the generality of our results.
CONCENTRATION INEQUALITIES AND ESTIMATION OF CONDITIONAL PROBABILITIES
Abstract

Cited by 6 (1 self)
Abstract. We prove concentration inequalities inspired from [DP] to obtain estimators of conditional probabilities for weakly dependent sequences. This generalizes results of Csiszár ([Cs]). For Gibbs measures and dynamical systems, these results lead to the construction of estimators of the potential function and also to a test of the nullity of the asymptotic variance of the system. This paper deals with the problems of typicality and conditional typicality of “empirical probabilities” for stochastic processes and the estimation of potential functions for Gibbs measures and dynamical systems. The questions of typicality have been studied in [FKT] for independent sequences and in [BRY, R] for Markov chains. In order to prove the consistency of estimators of transition probabilities for Markov chains of unknown order, results on typicality and conditional typicality for some Ψ-mixing processes were obtained in [CsS, Cs]. Unfortunately, many natural mixing processes do not satisfy this Ψ-mixing condition (see [DP]). We consider a class of mixing processes inspired from [DP]. For this class, we prove strong typicality and strong conditional typicality. In the particular case of Gibbs measures (or chains with complete connections) and for certain dynamical systems, from the typicality results we derive an estimator of the potential as well as a procedure to test the nullity of the asymptotic variance of the process. More formally, we consider a stochastic process X0, ..., Xn, ... taking values in a complete set Σ and a sequence of countable partitions (Pk)k∈N of Σ such that if P ∈ Pk then there exists a unique P̃ ∈ Pk−1 such that, almost surely, Xj ∈ P ⇒ Xj−1 ∈ P̃. Our aim is to obtain empirical estimates of the probabilities P(Xj ∈ P), P ∈ Pk, and of the conditional probabilities:
MDL denoising revisited
 IEEE Transactions on Signal Processing, 57(9):3347 – 3360
, 2009
Abstract

Cited by 5 (2 self)
Abstract — We refine and extend an earlier MDL denoising criterion for wavelet-based denoising. We start by showing that the denoising problem can be reformulated as a clustering problem, where the goal is to obtain separate clusters for informative and non-informative wavelet coefficients, respectively. This suggests two refinements: adding a code length for the model index, and extending the model in order to account for subband-dependent coefficient distributions. A third refinement is the derivation of a soft thresholding rule inspired by predictive universal coding with weighted mixtures. We propose a practical method incorporating all three refinements, which is shown to achieve good performance and robustness in denoising both artificial and natural signals. Index Terms — Minimum description length (MDL) principle, wavelets, denoising.
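The soft thresholding this abstract refines generalizes the classical shrinkage rule of wavelet denoising. For orientation, the classical rule itself is shown below; the paper's mixture-derived variant is not spelled out in this abstract, so this is only the baseline it builds on:

```python
import numpy as np

def soft_threshold(coeffs, lam):
    """Classical soft thresholding: shrink every wavelet coefficient
    toward zero by lam, zeroing those whose magnitude is below lam."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - lam, 0.0)
```

Unlike hard thresholding, this shrinks the surviving coefficients as well, which is what makes a soft rule compatible with the weighted-mixture coding view.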
Order Estimation for a Special Class of Hidden Markov Sources and Binary Renewal Processes
 IEEE Trans. Inform. Theory
, 2002
Abstract

Cited by 5 (0 self)
We consider the estimation of the order, i.e., the number of hidden states, of a special class of discrete-time finite-alphabet hidden Markov sources. This class can be characterized in terms of equivalent renewal processes. No a priori bound is assumed on the maximum permissible order. An order estimator based on renewal types is constructed, and is shown to be strongly consistent by computing the precise asymptotics of the probability of estimation error. The probability of underestimation of the true order decays exponentially in the number of observations while the probability of overestimation goes to zero sufficiently fast. It is further shown that this estimator has the best possible error exponent in a large class of estimators. Our results are also valid for the general class of binary independent-renewal processes with finite mean renewal times.
Two new Markov order estimators
 arXiv preprint math/0506080
, 2005
Abstract

Cited by 4 (0 self)
Abstract. We present two new methods for estimating the order (memory depth) of a finite alphabet Markov chain from observation of a sample path. One method is based on entropy estimation via recurrence times of patterns, and the other relies on a comparison of empirical conditional probabilities. The key to both methods is a qualitative change that occurs when a parameter (a candidate for the order) passes the true order. We also present extensions to order estimation for Markov random fields.
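The second method's idea, watching for the qualitative change when a candidate order passes the true order, can be illustrated directly: once the candidate context length reaches the true memory depth, lengthening the context no longer moves the empirical conditional probabilities beyond sampling noise. The comparison statistic and tolerance below are illustrative assumptions, not the paper's exact construction:

```python
from collections import Counter, defaultdict

def empirical_cond_probs(sample, k):
    """Empirical P(next symbol | previous k symbols) from a sample path."""
    ctx_counts, trans = Counter(), defaultdict(Counter)
    for i in range(k, len(sample)):
        ctx = tuple(sample[i - k:i])
        ctx_counts[ctx] += 1
        trans[ctx][sample[i]] += 1
    return {ctx: {s: c / ctx_counts[ctx] for s, c in nxt.items()}
            for ctx, nxt in trans.items()}

def estimate_order(sample, max_order=4, tol=0.05):
    """Smallest k for which adding one more symbol of context moves no
    empirical conditional probability by more than tol."""
    for k in range(max_order):
        shorter = empirical_cond_probs(sample, k)
        longer = empirical_cond_probs(sample, k + 1)
        worst = 0.0
        for ctx, dist in longer.items():
            base = shorter.get(ctx[1:], {})
            for s, p in dist.items():
                worst = max(worst, abs(p - base.get(s, 0.0)))
        if worst <= tol:
            return k
    return max_order
```

On a sample path from a genuinely first-order binary chain, the k = 0 comparison shows a large gap (conditionals differ from the marginal) while the k = 1 comparison shows only noise, so the estimator returns 1.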
NUMBER OF HIDDEN STATES AND MEMORY: A JOINT ORDER ESTIMATION PROBLEM FOR MARKOV CHAINS WITH MARKOV REGIME
Abstract

Cited by 4 (3 self)
Abstract. This paper deals with order identification for Markov chains with Markov regime (MCMR) in the context of finite alphabets. We define the joint order of an MCMR process in terms of the number k of states of the hidden Markov chain and the memory m of the conditional Markov chain. We study the properties of penalized maximum likelihood estimators for the unknown order (k, m) of an observed MCMR process, relying on information-theoretic arguments. The novelty of our work lies in the joint estimation of two structural parameters. Furthermore, the different models in competition are not nested. In an asymptotic framework, we prove that a penalized maximum likelihood estimator is strongly consistent without prior bounds on k and m. We complement our theoretical work with a simulation study of its behaviour. We also study numerically the behaviour of the BIC criterion. A theoretical proof of its consistency seems to us presently out of reach for MCMR, as such a result does not yet exist in the simpler case where m = 0 (that is, for hidden Markov models). Résumé. This work concerns order identification for a Markov chain with Markov regime (MCMR) over a finite alphabet. The order of an MCMR is defined as the pair (k, m), where k is the number of states of the hidden chain and m the memory of the conditional Markov chain. We study penalized maximum likelihood estimators using techniques drawn from
CONSISTENT ESTIMATION OF THE BASIC NEIGHBORHOOD OF MARKOV RANDOM FIELDS
, 2006
Abstract

Cited by 4 (0 self)
For Markov random fields on Z^d with finite state space, we address the statistical estimation of the basic neighborhood, the smallest region that determines the conditional distribution at a site on the condition that the values at all other sites are given. A modification of the Bayesian Information Criterion, replacing likelihood by pseudo-likelihood, is proved to provide strongly consistent estimation from observing a realization of the field on increasing finite regions: the estimated basic neighborhood equals the true one eventually almost surely, not assuming any prior bound on the size of the latter. Stationarity of the Markov field is not required, and phase transition does not affect the results.
Rate of convergence of penalized likelihood context tree estimators
, 2007
Abstract

Cited by 4 (1 self)
Abstract: We find upper bounds for the probability of error of penalized likelihood context tree estimators, including the well-known Bayesian Information Criterion (BIC). Our bounds are all explicit and apply to trees of bounded and unbounded depth. We show that the maximal decay for the probability of error can be achieved with a penalizing term of the form n^α, where n is the sample size and 0 < α < 1. As a consequence we obtain a strong consistency result for this penalizing term.
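The n^α penalty shape can be made concrete. The paper works with context trees; as a simpler stand-in under the same penalty shape, the sketch below selects a plain Markov order by maximizing log-likelihood minus n^α times the parameter count (the per-parameter weighting and the stand-in model class are my assumptions, not the paper's estimator):

```python
import math
from collections import Counter, defaultdict

def max_log_likelihood(sample, k):
    """Maximized log-likelihood of an order-k Markov model (plug-in
    transition-probability estimates)."""
    ctx = defaultdict(Counter)
    for i in range(k, len(sample)):
        ctx[tuple(sample[i - k:i])][sample[i]] += 1
    ll = 0.0
    for counts in ctx.values():
        total = sum(counts.values())
        ll += sum(c * math.log(c / total) for c in counts.values())
    return ll

def penalized_order(sample, alphabet_size=2, max_order=4, alpha=0.5):
    """argmax over k of  LL(k) - n**alpha * df(k),  where df(k) is the
    number of free transition parameters and 0 < alpha < 1 gives the
    n**alpha penalizing term from the abstract."""
    n = len(sample)
    scores = {k: max_log_likelihood(sample, k)
                 - n ** alpha * (alphabet_size ** k) * (alphabet_size - 1)
              for k in range(max_order + 1)}
    return max(scores, key=scores.get)
```

The likelihood gain from the true order grows linearly in n while the penalty grows only like n^α, so underestimation vanishes; the penalty still dominates the O(1) overfitting gain of spurious extra parameters, which is the balance behind the consistency result.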
Estimating Components in Finite Mixtures and Hidden Markov Models
, 2003
Abstract

Cited by 3 (1 self)
When the unobservable Markov chain in a hidden Markov model is stationary, the marginal distribution of the observations is a finite mixture with the number of terms equal to the number of states of the Markov chain. This suggests estimating the number of states of the unobservable Markov chain by determining the number of mixture components in the marginal distribution. We therefore present new methods for estimating the number of states in a hidden Markov model, and coincidentally the unknown number of components in a finite mixture, based on penalized quasi-likelihood and generalized quasi-likelihood ratio methods constructed from the marginal distribution. The procedures advocated are simple to calculate, and results obtained in empirical applications indicate that they are as effective as currently available methods based on the full likelihood. We show that, under fairly general regularity conditions, the methods proposed will generate strongly consistent estimates of the unknown number of states or components. Some key words: finite mixture, hidden Markov process, model selection, number of states, penalized quasi-likelihood, generalized quasi-likelihood ratio, strong consistency.