Results 1-10 of 54
The consistency of the BIC Markov order estimator.
Cited by 56 (3 self)
The Bayesian Information Criterion (BIC) estimates the order of a Markov chain (with finite alphabet $A$) from observation of a sample path $x_1, x_2, \ldots, x_n$, as that value $k = \hat{k}$ that minimizes the sum of the negative logarithm of the $k$th order maximum likelihood and the penalty term $\frac{|A|^k (|A| - 1)}{2} \log n$. We show that $\hat{k}$ equals the correct order of the chain, eventually almost surely as $n \to \infty$, thereby strengthening earlier consistency results that assumed an a priori bound on the order. A key tool is a strong ratio-typicality result for Markov sample paths. We also show that the Bayesian estimator or minimum description length estimator, of which the BIC estimator is an approximation, fails to be consistent for the uniformly distributed i.i.d. process. AMS 1991 subject classification: Primary 62F12, 62M05; Secondary 62F13, 60J10. Key words and phrases: Bayesian Information Criterion, order estimation, ratio-typicality, Markov chains. 1 Supported in part by a joint N...
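The estimator described in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names and the search bound `max_k` are assumptions made here (the paper's result needs no a priori bound, but a finite search range is required in practice), and the first k symbols are treated as a fixed initial context when computing the k-th order maximum likelihood.

```python
import math
from collections import Counter


def max_log_likelihood(x, k):
    """Log of the k-th order maximum likelihood of the sample path x.

    The first k symbols serve as the initial context (an assumption of
    this sketch), so only transitions x[i-k:i] -> x[i] are scored.
    """
    ctx_counts = Counter()
    trans_counts = Counter()
    for i in range(k, len(x)):
        ctx = tuple(x[i - k:i])
        ctx_counts[ctx] += 1
        trans_counts[(ctx, x[i])] += 1
    # Maximum likelihood transition probabilities are empirical frequencies.
    return sum(
        c * math.log(c / ctx_counts[ctx])
        for (ctx, _symbol), c in trans_counts.items()
    )


def bic_order(x, alphabet_size, max_k):
    """Return the k in 0..max_k minimizing -log ML_k + |A|^k (|A|-1)/2 * log n."""
    n = len(x)

    def score(k):
        penalty = alphabet_size ** k * (alphabet_size - 1) / 2 * math.log(n)
        return -max_log_likelihood(x, k) + penalty

    return min(range(max_k + 1), key=score)


# Usage: a strictly alternating binary path is perfectly predicted by an
# order-1 model, so the penalized score is minimized at k = 1.
print(bic_order([0, 1] * 500, alphabet_size=2, max_k=3))  # prints 1
```

For the alternating path, the order-0 model pays about n log 2 in likelihood, while every order k ≥ 1 fits perfectly, so the penalty term alone selects the smallest such order.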
Inequalities for the occurrence times of rare events in mixing processes. The state of the art.
 MARKOV PROC. RELAT. FIELDS
, 2000
Cited by 21 (3 self)
The first occurrence time of a rare event in a mixing process typically has a distribution which can be well approximated by the exponential law. In this paper we review recent theorems giving upper bounds for the error term of this approximation. We shall focus on papers treating the problem in a general mixing framework. Running title: Rare events in mixing processes.
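As a toy illustration of the approximation mentioned in this abstract, consider the simplest mixing process, an i.i.d. sequence in which the rare event occurs at each step with probability p. This special case is an assumption made here for illustration and is not the general framework the review covers. The first occurrence time is then geometric, and its tail can be compared directly with the exponential law; the function names below are hypothetical.

```python
import math


def geometric_tail(p, t):
    """P(first occurrence time > t) for independent trials with success probability p."""
    return (1.0 - p) ** t


def exponential_tail(p, t):
    """Tail of the exponential law with rate p, evaluated at t."""
    return math.exp(-p * t)


def approximation_error(p, t):
    """Absolute gap between the exact tail and its exponential approximation."""
    return abs(geometric_tail(p, t) - exponential_tail(p, t))


# Usage: the gap shrinks as the event becomes rarer.
for p in (0.1, 0.01, 0.001):
    print(p, approximation_error(p, round(1 / p)))
```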
The Interactions Between Ergodic Theory and Information Theory
 IEEE Transactions on Information Theory
, 1998
Cited by 17 (0 self)
Information theorists frequently use the ergodic theorem; likewise entropy concepts are often used in information theory. Recently the two subjects have become partially intertwined as deeper results from each discipline find use in the other. A brief history of this interaction is presented in this paper, together with a more detailed look at three areas of connection, namely, recurrence theory, blowing-up bounds, and direct sample-path methods.
On The Role of Pattern Matching In Information Theory
 IEEE TRANSACTIONS ON INFORMATION THEORY
Cited by 15 (2 self)
In this paper, the role of pattern matching in information theory is motivated and discussed. We describe the relationship between a pattern's recurrence time and its probability under the data-generating stochastic source. We motivate how this relationship has led to great advances in universal data compression. We then describe non-asymptotic uniform bounds on the performance of data compression algorithms in cases where the training data available to the encoder is not large enough to yield the asymptotic compression rate, the Shannon entropy. We then discuss applications of pattern matching and universal compression to universal prediction, classification, and entropy estimation.
DYNAMICS OF BAYESIAN UPDATING WITH DEPENDENT DATA AND MISSPECIFIED MODELS
, 2009
Cited by 10 (3 self)
Recent work on the convergence of posterior distributions under Bayesian updating has established conditions under which the posterior will concentrate on the truth, if the latter has a perfect representation within the support of the prior, and under various dynamical assumptions, such as the data being independent and identically distributed or Markovian. Here I establish sufficient conditions for the convergence of the posterior distribution in nonparametric problems even when all of the hypotheses are wrong, and the data-generating process has a complicated dependence structure. The main dynamical assumption is the generalized asymptotic equipartition (or “Shannon-McMillan-Breiman”) property of information theory. I derive a kind of large deviations principle for the posterior measure, and discuss the advantages of predicting using a combination of models known to be wrong. An appendix sketches connections between the present results and the “replicator dynamics” of evolutionary theory.
The Kolmogorov sampler
, 2002
Cited by 8 (1 self)
Given noisy observations $X_i = \theta_i + Z_i$, $i = 1, \ldots, n$, with noise $Z_i \overset{\text{iid}}{\sim} N(0, \sigma^2)$, we wish to recover the signal $\theta$ with small mean-squared error. We consider the Minimum Kolmogorov Complexity Estimator (MKCE), defined roughly as the $n$-vector $\hat{\theta}(X)$ solving the problem $\min_Y K(Y)$ subject to $\|X - Y\|^2_{\ell^2_n} \le \sigma^2 \cdot n$, where $K(Y)$ denotes the length of the shortest computer program that can compute the finite-precision $n$-vector $Y$. In words, this is the simplest object that fits the data to within the lack-of-fit between $\theta$ and $X$ that would be expected on statistical grounds. Suppose that the $\theta_i$ are successive samples from a stationary ergodic process obeying
Channel Simulation by Interval Algorithm: A Performance Analysis of Interval Algorithm
 IEEE Trans. Inform. Theory
Cited by 7 (2 self)
This paper deals with the problem of simulating a discrete memoryless channel and proposes two algorithms for channel simulation by using the interval algorithm. The first algorithm provides exact channel simulation and the number of fair random bits per input sample approaches the conditional resolvability of the channel with probability one. The second algorithm provides approximate channel simulation and the approximation error measured by the variational distance vanishes exponentially as the block length tends to infinity, when the number of fair random bits per input sample is above the conditional resolvability. Further, some asymptotic properties of these algorithms as well as the original interval algorithm for random number generation are clarified. Keywords: channel simulation, interval algorithm, conditional resolvability, conditional entropy, random number generation
Unifying Text Search And Compression  Suffix Sorting, Block Sorting and Suffix Arrays
, 2000
Cited by 6 (0 self)
Today many electronic documents are available, such as newspaper articles, dictionaries, books, and DNA sequences, and they are stored in databases. We also have many documents on the Internet and many email documents. Therefore, fast queries on such huge amounts of documents, and their compression to reduce the cost of storing or transferring them, are important. In this thesis, a unified method for improving the efficiency of search and compression for huge text data is proposed. All search methods and compression methods used in this thesis are related to a data structure called the suffix array. The suffix array is a text search data structure, and it is used in a text compression method called block sorting. Both are promising methods, for search and compression respectively, and there are many studies on them. Currently a data structure called the inverted file is used for queries over huge amounts of documents. Though it is widely used, the query unit is a document in order to reduce disk space to sto...
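The link this abstract draws between suffix arrays, text search, and block sorting can be illustrated with a small Python sketch. It is not the thesis's method: the construction below is the naive one (sorting all suffixes directly), the function names are hypothetical, and the `$` sentinel is an assumption that requires `$` to sort before every character of the text and not to occur in it.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive construction by direct sorting, chosen for clarity rather than speed.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return sorted start positions of pattern in text via binary search on sa."""
    m = len(pattern)
    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def bwt(text, sentinel="$"):
    """Block-sorting (Burrows-Wheeler) transform read off the suffix array.

    Each output character is the one preceding the corresponding sorted suffix.
    """
    s = text + sentinel
    return "".join(s[i - 1] for i in build_suffix_array(s))


# Usage
sa = build_suffix_array("banana")
print(sa)                                   # [5, 3, 1, 0, 4, 2]
print(find_occurrences("banana", sa, "ana"))  # [1, 3]
print(bwt("banana"))                        # annb$aa
```

The same sorted order of suffixes serves both purposes: binary search over it answers pattern queries, and reading off the preceding characters yields the block-sorted text that a compressor would then encode.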
SHORTEST SPANNING TREES AND A COUNTEREXAMPLE FOR RANDOM WALKS IN RANDOM ENVIRONMENTS
Cited by 6 (1 self)
We construct forests spanning $\mathbb{Z}^d$, $d \ge 2$, that are stationary and directed, and whose trees are infinite but are as short as possible. For $d \ge 3$, two independent copies of such forests, pointing in opposite directions, can be pruned so as to become disjoint. From this, we construct in $d \ge 3$ a stationary, polynomially mixing and uniformly elliptic environment of nearest-neighbor transition probabilities on $\mathbb{Z}^d$, for which the corresponding random walk (RWRE) disobeys a certain zero-one law for directional transience.