Results 1–10 of 24
Schemes for Bi-Directional Modeling of Discrete Stationary Sources
, 2005
Cited by 14 (9 self)
Adaptive models are developed to deal with bidirectional modeling of unknown discrete stationary sources, which can be generally applied to statistical inference problems such as noncausal universal discrete denoising that exploits bidirectional dependencies. Efficient algorithms for constructing those models are developed and implemented. Denoising is a primary focus of the application of those models, and we compare their performance to that of the DUDE algorithm [1] for universal discrete denoising.
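As a minimal illustration of what "bidirectional" means here (a hypothetical sketch, not the authors' algorithm or the DUDE rule), one can collect empirical counts of the middle symbol given its two-sided context:

```python
from collections import Counter, defaultdict

def bidirectional_counts(seq, k):
    """Count the middle symbol given its k-symbol left and right contexts.
    This two-sided statistic is the basic ingredient of noncausal
    (bidirectional) modeling of a stationary source."""
    counts = defaultdict(Counter)
    for i in range(k, len(seq) - k):
        left = tuple(seq[i - k:i])
        right = tuple(seq[i + 1:i + 1 + k])
        counts[(left, right)][seq[i]] += 1
    return counts

# On a strictly alternating sequence, the two-sided context determines
# the middle symbol exactly:
counts = bidirectional_counts([0, 1] * 100, 1)
print(counts[((0,), (0,))].most_common(1)[0][0])  # -> 1
```

A denoiser in the DUDE spirit would threshold such counts to decide whether an observed symbol is more likely a channel error than a genuine source symbol.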
Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates
, 2010
Cited by 7 (3 self)
The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is both structurally consistent and risk consistent, and that the error probability of structure learning decays faster than any polynomial in the number of samples for a fixed model size. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n, d, k) are given for the algorithm to satisfy structural and risk consistency. In addition, the extremal structures for learning are identified; we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn using the proposed algorithm in terms of error rates for structure learning.
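A toy version of the pipeline described above (hypothetical code; the paper's adaptive threshold is replaced here by a fixed cutoff) builds the Chow-Liu tree from empirical mutual information and prunes weak edges to obtain a forest:

```python
from collections import Counter
from math import log

def empirical_mi(xs, ys):
    """Empirical mutual information (in nats) between two discrete samples."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def chow_liu_forest(samples, threshold):
    """samples: n rows, each a tuple of d discrete values.
    Kruskal max-weight spanning tree over MI edge weights, keeping only
    edges whose MI exceeds `threshold` (a fixed, not adaptive, cutoff)."""
    d = len(samples[0])
    cols = list(zip(*samples))
    edges = sorted(((empirical_mi(cols[i], cols[j]), i, j)
                    for i in range(d) for j in range(i + 1, d)), reverse=True)
    parent = list(range(d))
    def find(u):                      # union-find with path halving
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    forest = []
    for w, i, j in edges:
        if w <= threshold:
            break                     # remaining edges are weaker: prune all
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            forest.append((i, j))
    return forest
```

For the consistency guarantees discussed in the abstract, the threshold would have to shrink with the sample size at an appropriate rate rather than stay fixed.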
Rate of convergence of penalized likelihood context tree estimators
, 2007
Cited by 5 (1 self)
We find upper bounds for the probability of error of penalized likelihood context tree estimators, including the well-known Bayesian Information Criterion (BIC). Our bounds are all explicit and apply to trees of bounded and unbounded depth. We show that the maximal decay for the probability of error can be achieved with a penalizing term of the form n^α, where n is the sample size and 0 < α < 1. As a consequence we obtain a strong consistency result for this penalizing term.
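One simple instantiation of this kind of criterion (a sketch under assumed details: full Markov models of order k rather than general context trees, and a penalty of n^α per free parameter) selects the order maximizing the penalized log-likelihood:

```python
from collections import Counter
from math import log

def log_ml(seq, k):
    """Maximized log-likelihood of an order-k Markov model for seq."""
    ctx_counts, pair_counts = Counter(), Counter()
    for i in range(k, len(seq)):
        ctx = tuple(seq[i - k:i])
        ctx_counts[ctx] += 1
        pair_counts[(ctx, seq[i])] += 1
    return sum(c * log(c / ctx_counts[ctx])
               for (ctx, s), c in pair_counts.items())

def select_order(seq, alpha=0.5, max_k=4, alphabet_size=2):
    """Pick the order with the largest penalized likelihood, the penalizing
    term being of the form n^alpha with 0 < alpha < 1."""
    n = len(seq)
    best_k, best_score = 0, float("-inf")
    for k in range(max_k + 1):
        df = (alphabet_size - 1) * alphabet_size ** k   # free parameters
        score = log_ml(seq, k) - df * n ** alpha
        if score > best_score:
            best_k, best_score = k, score
    return best_k

print(select_order([0, 1] * 100))  # alternating order-1 chain -> 1
```

The point of the n^α penalty, per the abstract, is that it is heavy enough for strong consistency yet light enough to give the fastest achievable decay of the error probability.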
NUMBER OF HIDDEN STATES AND MEMORY: A JOINT ORDER ESTIMATION PROBLEM FOR MARKOV CHAINS WITH MARKOV REGIME
Cited by 4 (3 self)
This paper deals with order identification for Markov chains with Markov regime (MCMR) in the context of finite alphabets. We define the joint order of an MCMR process in terms of the number k of states of the hidden Markov chain and the memory m of the conditional Markov chain. We study the properties of penalized maximum likelihood estimators for the unknown order (k, m) of an observed MCMR process, relying on information-theoretic arguments. The novelty of our work lies in the joint estimation of two structural parameters. Furthermore, the different models in competition are not nested. In an asymptotic framework, we prove that a penalized maximum likelihood estimator is strongly consistent without prior bounds on k and m. We complement our theoretical work with a simulation study of its behaviour. We also study numerically the behaviour of the BIC criterion. A theoretical proof of its consistency seems to us presently out of reach for MCMR, as such a result does not yet exist in the simpler case where m = 0 (that is, for hidden Markov models). Résumé (translated from French): This work concerns order identification for a Markov chain with Markov regime (MCMR) over a finite alphabet. The order of an MCMR is defined as the pair (k, m), where k is the number of states of the hidden chain and m is the memory of the conditional Markov chain. We study penalized maximum likelihood estimators using techniques drawn from ...
Exponential inequalities for empirical unbounded context trees
 In Progress in Probability, Birkhäuser
, 2008
Cited by 4 (4 self)
In this paper we obtain exponential upper bounds for the rate of convergence of a version of the algorithm Context when the underlying tree is not necessarily bounded. The algorithm Context is a well-known tool to estimate the context tree of a Variable Length Markov Chain. As a consequence of the exponential bounds we obtain a strong consistency result, generalizing several previous results in the field.
CONSISTENT ESTIMATION OF THE BASIC NEIGHBORHOOD OF MARKOV RANDOM FIELDS
, 2006
Cited by 4 (0 self)
For Markov random fields on Z^d with finite state space, we address the statistical estimation of the basic neighborhood, the smallest region that determines the conditional distribution at a site given the values at all other sites. A modification of the Bayesian Information Criterion, replacing likelihood by pseudo-likelihood, is proved to provide strongly consistent estimation from observing a realization of the field on increasing finite regions: the estimated basic neighborhood equals the true one eventually almost surely, without assuming any prior bound on the size of the latter. Stationarity of the Markov field is not required, and phase transition does not affect the results.
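The pseudo-likelihood at the heart of the modified criterion can be sketched as follows (hypothetical code for a 2D field; the real estimator compares this score, penalized, across candidate neighborhoods):

```python
from collections import Counter
from math import log

def max_log_pseudolikelihood(field, offsets):
    """Maximized log pseudo-likelihood of a 2D field for a candidate basic
    neighborhood given as (di, dj) offsets: plug in the empirical conditional
    distribution of each site value given its neighborhood pattern."""
    n, m = len(field), len(field[0])
    r = max((max(abs(di), abs(dj)) for di, dj in offsets), default=0)
    ctx_counts, joint_counts = Counter(), Counter()
    for i in range(r, n - r):          # interior sites only
        for j in range(r, m - r):
            ctx = tuple(field[i + di][j + dj] for di, dj in offsets)
            ctx_counts[ctx] += 1
            joint_counts[(ctx, field[i][j])] += 1
    return sum(c * log(c / ctx_counts[ctx])
               for (ctx, v), c in joint_counts.items())

# On a checkerboard, the single offset (-1, 0) determines the site value,
# so the pseudo-likelihood attains its maximum of zero:
board = [[(i + j) % 2 for j in range(6)] for i in range(6)]
print(max_log_pseudolikelihood(board, [(-1, 0)]))  # -> 0.0
```

The criterion in the paper then trades this fit term off against a penalty on the number of neighborhood configurations, so that too-large candidate neighborhoods are rejected.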
Stochastic chains with memory of variable length
 In Festschrift for Jorma Rissanen (Grünwald et al., eds.), TICSP Series 38:117–133
, 2008
Cited by 4 (0 self)
Dedicated to Jorma Rissanen on his 75th birthday. Stochastic chains with memory of variable length constitute an interesting family of stochastic chains of infinite order on a finite alphabet. The idea is that for each past, only a finite suffix of the past, called the context, is enough to predict the next symbol. These models were first introduced in the information theory literature by Rissanen (1983) as a universal tool to perform data compression. Recently, they have been used to model scientific data in areas as different as biology, linguistics and music. This paper presents a personal introductory guide to this class of models, focusing on the algorithm Context and its rate of convergence.
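A toy rendition of the algorithm's pruning step (hypothetical code for a binary alphabet, using a max-distance test in place of the likelihood-ratio statistic analyzed in the literature) keeps a candidate context only when it predicts the next symbol differently from its shorter suffix:

```python
from collections import Counter, defaultdict

def context_tree(seq, max_depth, delta):
    """Toy algorithm Context for a binary sequence: grow all contexts up to
    max_depth, then keep a context only if its empirical next-symbol
    distribution differs from its parent's by more than `delta`."""
    counts = defaultdict(Counter)
    for i in range(max_depth, len(seq)):
        for d in range(max_depth + 1):
            counts[tuple(seq[i - d:i])][seq[i]] += 1

    def dist(ctx):
        tot = sum(counts[ctx].values())
        return [counts[ctx][s] / tot for s in (0, 1)]

    kept = []
    for ctx in counts:
        if not ctx:
            continue
        parent = ctx[1:]  # drop the oldest symbol: the shorter suffix
        if max(abs(p - q) for p, q in zip(dist(ctx), dist(parent))) > delta:
            kept.append(ctx)
    return kept
```

On an alternating sequence this keeps exactly the two depth-1 contexts, reflecting that the chain is order 1; the rate-of-convergence results concern how fast such a procedure, with a properly tuned threshold, recovers the true context tree.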
Deinterleaving Markov Processes via Penalized ML
Cited by 2 (2 self)
We study the problem of deinterleaving a set of finite-memory (Markov) processes over disjoint finite alphabets, which have been randomly interleaved by a memoryless random switch. The deinterleaver has access to a sample of the resulting interleaved process, but no knowledge of the number or structure of the Markov processes, or of the parameters of the switch. We present a deinterleaving scheme based on minimizing a penalized maximum-likelihood cost function, and show it to be strongly consistent, in the sense of reconstructing, almost surely as the observed sequence length tends to infinity, the original Markov and switch processes. Solutions are described for the case where a bound on the order of the Markov processes is available, and for the case where it is not. We demonstrate that the proposed scheme performs well in practice, requiring much shorter input sequences for reliable deinterleaving than previous solutions.
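The flavor of the cost function can be conveyed by a small sketch (hypothetical code: order-1 chains, a BIC-style penalty, and an i.i.d. switch model stand in for the paper's actual construction). The cost of a candidate alphabet partition is the summed codelength of the deinterleaved subsequences plus that of the switch sequence:

```python
from collections import Counter
from math import log

def penalized_cost(sub, alphabet, n):
    """Codelength of `sub` under an order-1 Markov model: minus the maximized
    log-likelihood plus a BIC-style penalty per free parameter."""
    ctx, joint = Counter(), Counter()
    for a, b in zip(sub, sub[1:]):
        ctx[a] += 1
        joint[(a, b)] += 1
    ll = sum(c * log(c / ctx[a]) for (a, b), c in joint.items())
    k = len(alphabet)
    return -ll + 0.5 * k * (k - 1) * log(n)

def deinterleave_cost(seq, partition):
    """Total codelength of `seq` explained as independent order-1 chains over
    the blocks of `partition`, interleaved by a memoryless switch."""
    n = len(seq)
    cost = 0.0
    for block in partition:
        cost += penalized_cost([s for s in seq if s in block], block, n)
    # codelength of the switch sequence (which block emitted each symbol)
    which = Counter(next(i for i, b in enumerate(partition) if s in b)
                    for s in seq)
    cost += -sum(c * log(c / n) for c in which.values())
    cost += 0.5 * (len(partition) - 1) * log(n)
    return cost
```

Minimizing such a cost over partitions is the spirit of the scheme: the true partition makes the subsequences compress well, while a mismatched partition scrambles the Markov structure and inflates the codelength.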
RANDOM PERTURBATIONS OF STOCHASTIC CHAINS WITH UNBOUNDED VARIABLE LENGTH MEMORY
, 2007
Cited by 2 (2 self)
We consider binary infinite-order stochastic chains perturbed by a random noise. This means that at each time step, the value assumed by the chain can be randomly and independently flipped with a small fixed probability. We show that the transition probabilities of the perturbed chain are uniformly close to the corresponding transition probabilities of the original chain. As a consequence, in the case of stochastic chains with unbounded but otherwise finite variable length memory, we show that it is possible to recover the context tree of the original chain, using a suitable version of the algorithm Context, provided that the noise is small enough.
Testing statistical hypothesis on random trees
, 2006
Cited by 2 (1 self)
In this paper we address the problem of identifying differences between populations of trees. An example of such a population is a set of estimated context trees of Variable Length Markov Chains, an important modeling tool that has recently been used for protein classification without sequence alignment. Our approach is based on a hypothesis test proposed recently by Balding et al (2004) (the BFFS test), which involves a Kolmogorov-type statistic that, roughly speaking, maximizes the difference between the expected distance structures that characterize the samples of the populations. This characteristic makes it suitable even for applications where the populations have the same expected mean tree but a different node occupancy probability (marginal expected value) at some node. Computing the test statistic naively is quite difficult, since it is based on a supremum defined over the space of all trees, which grows exponentially fast. We show how to transform this problem into a max-flow problem over a network, which can be solved using the Ford-Fulkerson algorithm in time polynomial in the maximal number of nodes of the random tree. We also describe conditions that imply the characterization of the measure by the marginal distributions of each node (node occupancy probabilities) of the random tree, which validates the use of the BFFS test for measure discrimination. We study the performance of the test via simulations on Galton-Watson processes. We also discuss a real data example from genomics. We consider a protein functionality family as a random variable assuming values in the space of context trees and conjecture that different families induce different random variables. The context tree of a protein is then seen as a realization of the random variable corresponding to its family. We test (simultaneously) differences among 10 families of proteins, transforming their amino acid chains into trees via the PST algorithm from Bejerano et al (2004).
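The max-flow reduction mentioned above can be paired with any standard solver; for concreteness, here is a generic Ford-Fulkerson implementation with BFS augmenting paths (Edmonds-Karp), textbook code rather than the paper's specific network construction:

```python
from collections import deque

def max_flow(capacity, s, t):
    """Ford-Fulkerson with shortest augmenting paths (Edmonds-Karp).
    capacity: {u: {v: cap}} adjacency dict. Returns the max s-t flow value."""
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edges
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # no augmenting path left: flow is maximal
        # collect the path edges and their bottleneck capacity
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= aug
            residual[v][u] += aug
        flow += aug

cap = {'s': {'a': 3, 'b': 2}, 'a': {'t': 2, 'b': 1}, 'b': {'t': 3}}
print(max_flow(cap, 's', 't'))  # -> 5
```

Since each BFS runs in time linear in the number of edges, the overall procedure is polynomial in the size of the network, which is what makes the reduction from the exponential tree-space supremum worthwhile.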