Results 1–10 of 54
Schemes for Bi-Directional Modeling of Discrete Stationary Sources
, 2005
Abstract

Cited by 16 (9 self)
Adaptive models are developed to deal with bidirectional modeling of unknown discrete stationary sources, which can be generally applied to statistical inference problems such as noncausal universal discrete denoising that exploits bidirectional dependencies. Efficient algorithms for constructing those models are developed and implemented. Denoising is a primary focus of the application of those models, and we compare their performance to that of the DUDE algorithm [1] for universal discrete denoising.
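The two-sided counting idea behind such denoisers can be illustrated with a toy majority-vote rule over (left, right) contexts. This is only a sketch of the bidirectional principle, not the actual DUDE rule, which also weighs in the channel matrix and a loss function; the function name and parameters are invented for illustration:

```python
from collections import Counter, defaultdict

def bidirectional_majority_denoise(z, k=1):
    """Toy two-sided context denoiser: replace each interior symbol with
    the most frequent symbol observed in the same (left, right) context
    elsewhere in the noisy sequence.  A simplification of the DUDE idea,
    which additionally uses the channel matrix and a loss function."""
    n = len(z)
    counts = defaultdict(Counter)
    # first pass: count symbols per two-sided context of width k
    for i in range(k, n - k):
        ctx = (tuple(z[i - k:i]), tuple(z[i + 1:i + 1 + k]))
        counts[ctx][z[i]] += 1
    # second pass: majority vote within each context
    out = list(z)
    for i in range(k, n - k):
        ctx = (tuple(z[i - k:i]), tuple(z[i + 1:i + 1 + k]))
        out[i] = counts[ctx].most_common(1)[0][0]
    return out
```

A lone flipped bit inside a long run of zeros, for example, is outvoted by the other occurrences of the all-zero context.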
Stochastic chains with memory of variable length. In: Festschrift for Jorma Rissanen (Grünwald et al., eds), TICSP Series 38:117–133
, 2008
Abstract

Cited by 14 (0 self)
Dedicated to Jorma Rissanen on his 75th birthday. Stochastic chains with memory of variable length constitute an interesting family of stochastic chains of infinite order on a finite alphabet. The idea is that for each past, only a finite suffix of the past, called the context, is enough to predict the next symbol. These models were first introduced in the information theory literature by Rissanen (1983) as a universal tool to perform data compression. Recently, they have been used to model scientific data in areas as different as biology, linguistics and music. This paper presents a personal introductory guide to this class of models, focusing on the algorithm Context and its rate of convergence.
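The defining property — that a finite suffix (the context) of the past suffices to predict the next symbol — amounts to a suffix lookup in a set of contexts. A minimal sketch, with a made-up context set rather than one from the paper:

```python
def find_context(past, contexts):
    """Return the shortest suffix of `past` (most recent symbol last)
    that belongs to `contexts`.  For a proper (complete) context tree,
    exactly one suffix of every long-enough past matches."""
    for l in range(len(past) + 1):
        suffix = tuple(past[len(past) - l:])
        if suffix in contexts:
            return suffix
    raise ValueError("no context matches this past")

# Hypothetical binary context tree: past ending in 1 needs one symbol,
# past ending in 0 needs two.
contexts = {(1,), (0, 0), (1, 0)}
```

Each context would then carry its own next-symbol distribution, so prediction depends on the past only through this lookup.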
CONSISTENT ESTIMATION OF THE BASIC NEIGHBORHOOD OF MARKOV RANDOM FIELDS
, 2006
Abstract

Cited by 14 (0 self)
For Markov random fields on Z^d with finite state space, we address the statistical estimation of the basic neighborhood, the smallest region that determines the conditional distribution at a site given the values at all other sites. A modification of the Bayesian Information Criterion, replacing likelihood by pseudolikelihood, is proved to provide strongly consistent estimation from observing a realization of the field on increasing finite regions: the estimated basic neighborhood equals the true one eventually almost surely, not assuming any prior bound on the size of the latter. Stationarity of the Markov field is not required, and phase transition does not affect the results.
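The pseudolikelihood that replaces the likelihood in the modified criterion is a product of site-wise conditional probabilities given the candidate neighborhood. A minimal sketch for a binary field on a finite 2-D grid, using plug-in (count-based) conditionals and a toroidal boundary — both simplifying assumptions of this illustration, not choices made by the paper:

```python
import math
from collections import Counter, defaultdict

def log_pseudolikelihood(field, neighborhood):
    """Empirical log-pseudolikelihood of a discrete field on a 2-D grid.
    `neighborhood` is a list of (dr, dc) offsets defining the candidate
    basic neighborhood.  The penalized criterion in the paper would
    subtract a BIC-style term growing with the number of neighbor
    configurations."""
    rows, cols = len(field), len(field[0])
    counts = defaultdict(Counter)
    sites = []
    for r in range(rows):
        for c in range(cols):
            # neighbor configuration, with toroidal wrap-around
            cfg = tuple(field[(r + dr) % rows][(c + dc) % cols]
                        for dr, dc in neighborhood)
            counts[cfg][field[r][c]] += 1
            sites.append((cfg, field[r][c]))
    # sum of log plug-in conditional probabilities, one term per site
    return sum(math.log(counts[cfg][x] / sum(counts[cfg].values()))
               for cfg, x in sites)
```

Candidate neighborhoods are then compared by this score minus a penalty, the largest penalized score giving the estimate.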
Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates
, 2010
Abstract

Cited by 13 (8 self)
The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is both structurally consistent and risk consistent, and the error probability of structure learning decays faster than any polynomial in the number of samples under fixed model size. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n,d,k) are given for the algorithm to satisfy structural and risk consistencies. In addition, the extremal structures for learning are identified; we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn using the proposed algorithm in terms of error rates for structure learning.
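The Chow-Liu-plus-thresholding scheme can be sketched as a maximum-weight spanning tree on empirical pairwise mutual information whose weak edges are then discarded. The constant threshold below stands in for the paper's adaptive one, and the function names are ours:

```python
import math
from collections import Counter

def mutual_info(xs, ys):
    """Empirical mutual information (nats) between two discrete columns."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in joint.items())

def chow_liu_forest(data, threshold):
    """data: list of samples, each a tuple of d discrete values.
    Build the Chow-Liu tree by Kruskal's algorithm on empirical mutual
    information, keeping only edges whose MI exceeds `threshold` --
    the thresholding step is what turns the tree into a forest."""
    d = len(data[0])
    cols = list(zip(*data))
    edges = sorted(((mutual_info(cols[i], cols[j]), i, j)
                    for i in range(d) for j in range(i + 1, d)),
                   reverse=True)
    parent = list(range(d))        # union-find for Kruskal
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    forest = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            if w > threshold:      # pruning step: drop weak edges
                forest.append((i, j))
    return forest
```

On data where variables 0 and 1 are copies and variable 2 is independent, only the (0, 1) edge survives the threshold.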
Modeling the DOCSIS 1.1/2.0 MAC Protocol, ICCCN '03
, 2003
Abstract

Cited by 13 (2 self)
Keywords: universal data compression; enumerative coding; tree models; Markov sources; method of types. Efficient enumerative coding for tree sources is, in general, surprisingly intricate: a simple uniform encoding of type classes, which is asymptotically optimal in expectation for many classical models such as FSMs, turns out not to be so in this case. We describe an efficiently computable enumerative code that is universal in the family of tree models in the sense that, for a string emitted by an unknown source whose model is supported on a known tree, the expected normalized code length of the encoding approaches the entropy rate of the source with a convergence rate (K/2)(log n)/n, where K is the number of free parameters of the model family. Based on recent results characterizing type classes of context trees, the code consists of the index of the sequence in the tree type class, and an efficient description of the class itself using a nonuniform encoding of selected string counts. The results are extended to a twice-universal setting, where the tree underlying the source model is unknown.
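For intuition on what "the index of the sequence in the type class" means, here is the classical memoryless special case: Cover-style enumerative coding of a binary string by its lexicographic rank within the set of strings of the same length and weight. The tree-model code in the abstract generalizes this to context-tree type classes; the function name is ours:

```python
from math import comb

def rank_in_type_class(bits):
    """Lexicographic rank of a binary string among all strings of the
    same length and Hamming weight (its memoryless type class).
    The enumerative code word is this rank plus a description of the
    class itself, here just (length, weight)."""
    rank, ones_left = 0, sum(bits)
    for i, b in enumerate(bits):
        if b == 1:
            # all strings with a 0 at position i and the same remaining
            # weight precede this one lexicographically
            rank += comb(len(bits) - i - 1, ones_left)
            ones_left -= 1
    return rank
```

Since the class of length-n weight-w strings has C(n, w) members, the rank needs only log2 C(n, w) bits, which is what makes enumerative coding attractive when the class can be described cheaply.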
Exponential inequalities for empirical unbounded context trees
In: Progress in Probability, Birkhäuser
, 2008
Abstract

Cited by 9 (5 self)
In this paper we obtain exponential upper bounds for the rate of convergence of a version of the algorithm Context when the underlying tree is not necessarily bounded. The algorithm Context is a well-known tool to estimate the context tree of a Variable Length Markov Chain. As a consequence of the exponential bounds we obtain a strong consistency result. We generalize in this way several previous results in the field.
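A minimal sketch of the pruning idea behind the algorithm Context: keep a candidate context only if it changes the empirical next-symbol law of its parent suffix by more than a threshold. The exact statistic and threshold vary across versions in the literature, including the one analyzed here, so treat this only as a flavor of the procedure:

```python
import math
from collections import Counter, defaultdict

def context_tree(seq, max_depth, threshold):
    """Estimate a context tree from `seq`.  A candidate context w (most
    recent symbol last) is kept when the count-weighted log-likelihood
    ratio between its next-symbol law and that of its parent suffix
    w[1:] exceeds `threshold`.  A sketch, not the analyzed version."""
    counts = defaultdict(Counter)
    for d in range(max_depth + 1):
        for i in range(d, len(seq)):
            counts[tuple(seq[i - d:i])][seq[i]] += 1
    def gain(w):
        parent = w[1:]
        n_w = sum(counts[w].values())
        n_p = sum(counts[parent].values())
        return sum(c * math.log((c / n_w) / (counts[parent][s] / n_p))
                   for s, c in counts[w].items())
    kept = {()}
    frontier = [w for w in counts if len(w) == 1]
    while frontier:              # grow depth by depth
        nxt = []
        for w in frontier:
            if w[1:] in kept and gain(w) > threshold:
                kept.add(w)
                nxt.extend(u for u in counts
                           if len(u) == len(w) + 1 and u[1:] == w)
        frontier = nxt
    return kept
```

On a deterministic alternating sequence, depth-1 contexts are highly informative while deeper ones add nothing, so the estimate stops at depth 1.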
Testing statistical hypothesis on random trees
, 2006
Abstract

Cited by 7 (1 self)
In this paper we address the problem of identifying differences between populations of trees. An example of such a population is the set of estimated context trees of a Variable Length Markov Chain, an important modeling tool that has recently been used for protein classification without sequence alignment. Our approach is based on a hypothesis test proposed recently by Balding et al (2004) (the BFFS test), which involves a Kolmogorov-type statistic that, roughly speaking, maximizes the difference between the expected distance structures that characterize the samples of the populations. This characteristic makes it suitable even for applications where the populations have the same expected mean tree but a different node occupancy probability (marginal expected value) at some node. A naive computation of the test statistic is quite difficult, since it is based on a supremum defined over the space of all trees, which grows exponentially fast. We show how to transform this problem into a max-flow problem over a network, which can be solved using the Ford-Fulkerson algorithm in time polynomial in the maximal number of nodes of the random tree. We also describe conditions that imply the characterization of the measure by the marginal distributions of each node (node occupancy probabilities) of the random tree, which validates the use of the BFFS test for measure discrimination. We study the performance of the test via simulations on Galton-Watson processes. We also discuss a real-data example from genomics. We consider a protein functionality family as a random variable assuming values in the space of context trees and conjecture that different families induce different random variables. The context tree of a protein is then seen as a realization of the random variable corresponding to its family. We test (simultaneously) differences among 10 families of proteins, transforming their amino acid chains into trees via the PST algorithm from Bejerano et al (2004).
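The reduction to max-flow means the statistic can be computed with any polynomial-time Ford-Fulkerson variant. Below is a generic Edmonds-Karp (BFS-based Ford-Fulkerson) sketch of that subroutine — not the BFFS statistic or the paper's network construction:

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: repeatedly augment along a shortest (BFS) path in
    the residual graph.  `capacity` is a dict-of-dicts adjacency map."""
    # residual graph with explicit zero-capacity reverse edges
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u in capacity:
        for v in capacity[u]:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for an augmenting path from source to sink
        parent = {source: None}
        q = deque([source])
        while q and sink not in parent:
            u = q.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if sink not in parent:
            return flow            # no augmenting path left
        # recover the path and its bottleneck capacity
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:          # push flow, update residuals
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck
```

Edmonds-Karp runs in O(V E^2), which keeps the overall computation polynomial in the number of tree nodes, as the abstract requires.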
NUMBER OF HIDDEN STATES AND MEMORY: A JOINT ORDER ESTIMATION PROBLEM FOR MARKOV CHAINS WITH MARKOV REGIME
Abstract

Cited by 5 (4 self)
This paper deals with order identification for Markov chains with Markov regime (MCMR) in the context of finite alphabets. We define the joint order of an MCMR process in terms of the number k of states of the hidden Markov chain and the memory m of the conditional Markov chain. We study the properties of penalized maximum likelihood estimators for the unknown order (k, m) of an observed MCMR process, relying on information-theoretic arguments. The novelty of our work lies in the joint estimation of two structural parameters. Furthermore, the different models in competition are not nested. In an asymptotic framework, we prove that a penalized maximum likelihood estimator is strongly consistent without prior bounds on k and m. We complement our theoretical work with a simulation study of its behaviour. We also study numerically the behaviour of the BIC criterion. A theoretical proof of its consistency seems to us presently out of reach for MCMR, as such a result does not yet exist in the simpler case where m = 0 (that is, for hidden Markov models).
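The penalized maximum likelihood principle used for (k, m) can be illustrated in the much simpler m-only special case (k = 1, i.e. an ordinary Markov chain) with a BIC penalty. This simplification is ours, not the paper's estimator:

```python
import math
from collections import Counter, defaultdict

def bic_markov_order(seq, alphabet_size, max_order):
    """Select a Markov memory m by penalized maximum likelihood with a
    BIC penalty: maximize log-likelihood - 0.5 * #params * log n.
    This is the m-only special case of the joint (k, m) problem."""
    n = len(seq)
    best = None
    for m in range(max_order + 1):
        trans = defaultdict(Counter)
        for i in range(m, n):
            trans[tuple(seq[i - m:i])][seq[i]] += 1
        # maximized log-likelihood with plug-in transition probabilities
        ll = sum(c * math.log(c / sum(ctr.values()))
                 for ctr in trans.values() for c in ctr.values())
        params = (alphabet_size - 1) * alphabet_size ** m
        score = ll - 0.5 * params * math.log(n)
        if best is None or score > best[0]:
            best = (score, m)
    return best[1]
```

The penalty grows with the model dimension, so over-long memories are rejected even though their raw likelihood is at least as high; the joint (k, m) case in the paper must additionally handle non-nested competing models.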
The computational structure of spike trains
 Neural Computation
, 2010
Abstract

Cited by 5 (1 self)
Neurons perform computations, and convey the results of those computations through the statistical structure of their output spike trains. Here we present a practical method, grounded in the information-theoretic analysis of prediction, for inferring a minimal representation of that structure and for characterizing its complexity. Starting from spike trains, our approach finds their causal state models (CSMs), the minimal hidden Markov models or stochastic automata capable of generating statistically identical time series. We then use these CSMs to objectively quantify both the generalizable structure and the idiosyncratic randomness of the spike train. Specifically, we show that the expected algorithmic information content (the information needed to describe the spike train exactly) can be split into three parts describing (1) the time-invariant structure (complexity) of the minimal spike-generating process, which describes the spike train statistically, (2) the randomness (internal entropy rate) of the minimal spike-generating process, and (3) a residual pure noise term not described by the minimal spike-generating process. We use CSMs to approximate each of these quantities. The CSMs are inferred nonparametrically from the data, making only mild regularity assumptions, via the Causal State Splitting Reconstruction (CSSR) algorithm. The methods presented here complement more traditional spike train analyses by describing not only spiking probability and spike train entropy, but also the complexity of a spike train’s structure. We demonstrate our approach using both simulated spike trains and experimental data recorded in rat barrel cortex during vibrissa stimulation.
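The causal-state idea — histories are equivalent when they predict the same distribution over futures — can be given a crude flavor by merging fixed-length histories whose empirical next-symbol laws are close. The real CSSR algorithm instead grows suffixes and splits states using statistical tests, so the code below is only a sketch of the equivalence-class notion:

```python
from collections import Counter, defaultdict

def cluster_histories(seq, length, eps=0.1):
    """Group length-`length` histories of `seq` whose empirical
    next-symbol distributions are within total-variation distance `eps`.
    A crude stand-in for causal-state construction, not CSSR itself."""
    dists = defaultdict(Counter)
    for i in range(length, len(seq)):
        dists[tuple(seq[i - length:i])][seq[i]] += 1
    symbols = sorted(set(seq))
    def prob(ctr):
        n = sum(ctr.values())
        return [ctr[s] / n for s in symbols]
    states = []            # each state: list of histories merged together
    for hist, ctr in sorted(dists.items()):
        p = prob(ctr)
        for state in states:
            q = prob(dists[state[0]])
            # total-variation distance between the two predictive laws
            if 0.5 * sum(abs(a - b) for a, b in zip(p, q)) <= eps:
                state.append(hist)
                break
        else:
            states.append([hist])
    return states
```

For a period-2 spike train the two one-symbol histories predict different futures and therefore land in different states, mirroring how a CSM of that process needs two causal states.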