On The Role of Pattern Matching In Information Theory
 IEEE TRANSACTIONS ON INFORMATION THEORY
Cited by 15 (2 self)
In this paper, the role of pattern matching in information theory is motivated and discussed. We describe the relationship between a pattern's recurrence time and its probability under the data-generating stochastic source. We motivate how this relationship has led to great advances in universal data compression. We then describe nonasymptotic uniform bounds on the performance of data compression algorithms in cases where the training data available to the encoder is not large enough to yield the asymptotic compression rate, the Shannon entropy. We then discuss applications of pattern matching and universal compression to universal prediction, classification, and entropy estimation.
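One well-known form of the recurrence-time relationship mentioned above is Kac's lemma: for a stationary ergodic source, the expected recurrence time of a pattern is the reciprocal of its probability. The sketch below is only an illustration of that idea, not the paper's method; the function names `recurrence_times` and `entropy_from_recurrence` are hypothetical, and the estimate log2(mean recurrence time) / pattern length is a rough heuristic that is meaningful only for long patterns and long sequences.

```python
import math


def recurrence_times(seq, pattern):
    """Gaps between successive (possibly overlapping) occurrences of pattern in seq."""
    m = len(pattern)
    starts = [i for i in range(len(seq) - m + 1) if seq[i:i + m] == pattern]
    return [b - a for a, b in zip(starts, starts[1:])]


def entropy_from_recurrence(seq, pattern):
    """Heuristic per-symbol estimate: log2(mean recurrence time) / len(pattern).

    Assumption: by Kac's lemma the mean recurrence time approximates
    1 / P(pattern), so its logarithm approximates -log2 P(pattern).
    """
    gaps = recurrence_times(seq, pattern)
    if not gaps:
        raise ValueError("pattern must occur at least twice in seq")
    return math.log2(sum(gaps) / len(gaps)) / len(pattern)
```

For example, `recurrence_times("abcabcabc", "abc")` gives the gaps between the three occurrences of `"abc"`, each of length 3.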
Computing the Entropy of User Navigation in the Web
 International Journal of Information Technology and Decision Making
, 1999
Cited by 10 (1 self)
Navigation through the web, colloquially known as "surfing", is one of the main activities of users during web interaction. When users follow a navigation trail they often tend to get disoriented in terms of the goals of their original query and thus the discovery of typical user trails could be useful in providing navigation assistance. Herein we give a theoretical underpinning of user navigation in terms of the entropy of an underlying Markov chain modelling the web topology. We present a novel method for online incremental computation of the entropy and a large deviation result regarding the length of a trail to realise the said entropy. We provide an error analysis for our estimation of the entropy in terms of the divergence between the empirical and actual probabilities. We also provide an extension of our technique to higher-order Markov chains by a suitable reduction of a higher-order Markov chain model to a first-order one.
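For reference, the entropy rate of a first-order Markov chain with transition matrix P and stationary distribution π is H = -Σ_i π_i Σ_j P_ij log2 P_ij. The sketch below computes this textbook quantity in batch form; it is not the paper's online incremental method. The function names are hypothetical, and the power iteration assumes an irreducible, aperiodic chain so that it converges.

```python
import math


def stationary_distribution(P, iters=1000):
    """Approximate the stationary distribution of row-stochastic P by power iteration.

    Assumption: the chain is irreducible and aperiodic, so the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def markov_entropy_rate(P):
    """Entropy rate in bits per step: -sum_i pi_i sum_j P_ij log2 P_ij."""
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log2(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0  # zero-probability transitions contribute nothing
    )
```

A chain that moves to either of two pages with equal probability from every page, `[[0.5, 0.5], [0.5, 0.5]]`, has an entropy rate of one bit per step under this formula.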
Estimating the Entropy of Binary Time Series: Methodology, Some Theory and a Simulation Study
"... entropy ..."
Universal erasure entropy estimation
 In Proc. of the 2006 IEEE Intl. Symp. on Inform. Theory (ISIT’06)
, 2006
Cited by 4 (3 self)
Erasure entropy rate (introduced recently by Verdú and Weissman) differs from Shannon’s entropy rate in that the conditioning occurs with respect to both the past and the future, as opposed to only the past (or the future). In this paper, universal algorithms for estimating erasure entropy rate are proposed based on the basic and extended context-tree weighting (CTW) algorithms. Consistency results are shown for those CTW-based algorithms. Simulation results for those algorithms applied to Markov sources, tree sources and English texts are compared to those obtained by fixed-order plug-in estimators with different orders. An estimate of the erasure entropy of English texts based on the proposed algorithms is about 0.22 bits per letter, which can be compared to an estimate of about 1.3 bits per letter for the entropy rate of English texts by a similar CTW-based algorithm.
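To make the two-sided conditioning concrete, here is a minimal sketch of a fixed-order plug-in estimator of the kind the abstract uses as a baseline: it estimates the conditional entropy of a symbol given its k past and k future neighbours from empirical counts. It is not the CTW-based algorithm the paper proposes, and the function name and the choice of a symmetric window of order k are assumptions made for illustration.

```python
from collections import Counter
import math


def plugin_erasure_entropy(seq, k):
    """Empirical H(X_i | k symbols before i, k symbols after i), in bits."""
    joint = Counter()    # counts of (two-sided context, symbol)
    context = Counter()  # counts of the two-sided context alone
    for i in range(k, len(seq) - k):
        ctx = (tuple(seq[i - k:i]), tuple(seq[i + 1:i + k + 1]))
        joint[(ctx, seq[i])] += 1
        context[ctx] += 1
    n = sum(context.values())
    if n == 0:
        raise ValueError("sequence too short for the requested order k")
    return -sum(
        (c / n) * math.log2(c / context[ctx])
        for (ctx, _symbol), c in joint.items()
    )
```

With `k = 0` the context is empty and the function reduces to the ordinary empirical entropy of single symbols, which gives a quick sanity check.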
Estimation of the rate-distortion function
 2007. [Online]. Available: http://arxiv.org/abs/cs/0702018v1
Cited by 3 (0 self)
Motivated by questions in lossy data compression and by theoretical considerations, this paper examines the problem of estimating the rate-distortion function of an unknown (not necessarily discrete-valued) source from empirical data. The main focus is the behavior of the so-called “plug-in” estimator, which is simply the rate-distortion function of the empirical distribution of the observed data. Sufficient conditions are given for its consistency, and examples are provided to demonstrate that in certain cases it fails to converge to the true rate-distortion function. The analysis of the performance of the plug-in estimator is somewhat surprisingly intricate, even for stationary memoryless sources; the underlying mathematical problem is closely related to the classical problem of establishing the consistency of maximum likelihood estimators. General consistency results are given for the plug-in estimator applied to a broad class of sources, including all stationary and ergodic ones. A more general class of estimation problems is also considered, arising in the context of lossy data compression when the allowed class of coding distributions is restricted; analogous results are developed for the plug-in estimator in that case. Finally, consistency theorems are formulated for modified (e.g., penalized) versions of the plug-in estimator, and for estimating the optimal reproduction distribution.
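For a finite alphabet, the plug-in idea can be sketched by forming the empirical distribution of the samples and computing a point on its rate-distortion curve. The example below uses the standard Blahut-Arimoto iteration for that second step; this is a swapped-in numerical technique chosen for illustration, not something the abstract prescribes. The function name, the slope parameter `beta`, the finite reproduction alphabet `repro`, and the fixed iteration count are all assumptions.

```python
from collections import Counter
import math


def plugin_rate_distortion(samples, repro, dist, beta, iters=200):
    """Return (R, D) on the rate-distortion curve of the empirical distribution.

    beta > 0 selects the point on the curve (larger beta means lower distortion);
    dist(x, y) is the distortion measure. Requires iters >= 1.
    """
    n = len(samples)
    p = {x: c / n for x, c in Counter(samples).items()}  # empirical source law
    q = {y: 1.0 / len(repro) for y in repro}             # initial output law
    for _ in range(iters):
        cond = {}
        for x in p:
            w = {y: q[y] * math.exp(-beta * dist(x, y)) for y in repro}
            z = sum(w.values())
            cond[x] = {y: w[y] / z for y in repro}       # test channel p(y|x)
        q = {y: sum(p[x] * cond[x][y] for x in p) for y in repro}
    D = sum(p[x] * cond[x][y] * dist(x, y) for x in p for y in repro)
    R = sum(
        p[x] * cond[x][y] * math.log2(cond[x][y] / q[y])
        for x in p
        for y in repro
        if cond[x][y] > 0
    )
    return R, D
```

For a balanced binary sample with Hamming distortion, the returned pair should satisfy the known closed form R = 1 - h(D), where h is the binary entropy function.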
Statistical techniques for text classification based on word recurrence intervals
 Fluctuations and Noise Letters 3
, 2003
Cited by 2 (1 self)
We present a method for characterizing text based on a statistical analysis of word recurrence intervals. This method can be used for extracting keywords from text, and also for comparing texts by an unknown author against a set of known authors. We also use these methods to comment on the controversial question of who wrote the letter to the Hebrews in the New Testament.
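A word's recurrence intervals are the gaps, counted in words, between its successive occurrences in a text. The sketch below collects those gaps and computes one simple summary of their spread. The abstract does not say which statistic the authors use, so the ratio of standard deviation to mean here is an illustrative assumption, as are the function names and the simple tokenizer.

```python
import re
import statistics


def word_recurrence_intervals(text):
    """Map each repeated word to the list of gaps between its successive occurrences."""
    words = re.findall(r"[a-z']+", text.lower())
    last_seen = {}
    intervals = {}
    for pos, word in enumerate(words):
        if word in last_seen:
            intervals.setdefault(word, []).append(pos - last_seen[word])
        last_seen[word] = pos
    return intervals


def interval_spread(gaps):
    """Population standard deviation of the gaps divided by their mean.

    Illustrative statistic only: evenly spaced words score near 0,
    clustered words score higher.
    """
    return statistics.pstdev(gaps) / statistics.mean(gaps)
```

For instance, in the text `"a b a c a"` the word `a` recurs every two words, so its spread is zero.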
From the entropy to the statistical structure of spike trains
 IEEE Int. Symp. on Inform. Theory
, 2006
Cited by 2 (1 self)
We use statistical estimates of the entropy rate of spike train data in order to make inferences about the underlying structure of the spike train itself. We first examine a number of different parametric and nonparametric estimators (some known and some new), including the “plug-in” method, several versions of Lempel-Ziv-based compression algorithms, a maximum likelihood estimator tailored to renewal processes, and the natural estimator derived from the Context-Tree Weighting method (CTW). The theoretical properties of these estimators are examined, several new theoretical results are developed, and all estimators are systematically applied to various types of synthetic data and under different conditions. Our main focus is on the performance of these entropy estimators on the (binary) spike trains of 28 neurons recorded simultaneously for a one-hour period from the primary motor and dorsal premotor cortices of a monkey. We show how the entropy estimates can be used to test for the existence of long-term structure in the data, and we construct a hypothesis test for whether the renewal process model is appropriate for these spike trains. Further, by applying the CTW algorithm we derive the maximum a posteriori (MAP) tree model of our empirical data, and comment on the underlying structure it reveals.
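Of the estimators listed, the plug-in method is the simplest to sketch: count overlapping blocks of w consecutive symbols, compute the entropy of the empirical block distribution, and divide by w. The version below is a minimal illustration under that reading; the function name and the use of overlapping blocks are assumptions, and no bias correction is applied.

```python
from collections import Counter
import math


def plugin_entropy_rate(bits, w):
    """Plug-in entropy rate in bits per symbol from overlapping blocks of length w."""
    blocks = [tuple(bits[i:i + w]) for i in range(len(bits) - w + 1)]
    if not blocks:
        raise ValueError("sequence shorter than block length w")
    n = len(blocks)
    counts = Counter(blocks)
    block_entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return block_entropy / w
```

On a strictly alternating sequence the estimate drops as w grows, because longer blocks expose the deterministic structure that single-symbol counts miss.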
Order Estimation of Markov Chains
, 711
We describe estimators χ_n(X_0, X_1, ..., X_n), which when applied to an unknown stationary process taking values from a countable alphabet X, converge almost surely to k in case the process is a k-th order Markov chain and to infinity otherwise.
Estimating Entropy Rates with Bayesian Confidence Intervals
 Letter, communicated by Jonathan Victor
 M. Kennel, J. Shlens, H. Abarbanel, and E. Chichilnisky, Neural Computation 17, 1531–1576 (2005). © 2005 Massachusetts Institute of Technology
The entropy rate quantifies the amount of uncertainty or disorder produced by any dynamical system. In a spiking neuron, this uncertainty translates into the amount of information potentially encoded and is thus the subject of intense theoretical and experimental investigation. Estimating this quantity in observed, experimental data is difficult and requires a judicious selection of probabilistic models, balancing between two opposing biases. We use a model weighting principle originally developed for lossless data compression, following the minimum description length principle. This weighting yields a direct estimator of the entropy rate, which, compared to existing methods, exhibits significantly less bias and converges faster in simulation. With Monte Carlo techniques, we estimate a Bayesian confidence interval for the entropy rate. In related work, we apply these ideas to estimate the information rates between sensory stimuli and neural responses in experimental data (Shlens, Kennel, Abarbanel, & Chichilnisky, 2004).