Results 1–10 of 22
Convergence properties of functional estimates for discrete distributions. Random Struct. Algorithms, 2001
Cited by 45 (3 self)
Abstract: Suppose P is an arbitrary discrete distribution on a countable alphabet. Given an i.i.d. sample X1, ..., Xn drawn from P, we consider the problem of estimating the entropy H(P) or some other functional F = F(P) of the unknown distribution P. We show that, for additive functionals satisfying mild conditions (including the cases of the mean, the entropy, and mutual information), the plug-in estimates of F are universally consistent. We also prove that, without further assumptions, no rate-of-convergence results can be obtained for any sequence of estimators. In the case of entropy estimation, under a variety of different assumptions, we get rate-of-convergence results for the plug-in estimate and for a nonparametric estimator based on match lengths. The behavior of the variance and the expected error of the plug-in estimate is shown to be in sharp contrast to the finite-alphabet case. A number of other important examples of functionals are also treated in some detail.
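The plug-in estimate discussed in this abstract is simply the entropy of the empirical distribution. A minimal sketch in Python (the function name and example data are illustrative, not taken from the paper):

```python
from collections import Counter
from math import log2

def plugin_entropy(sample):
    """Plug-in (empirical) entropy estimate in bits: substitute the
    empirical frequencies into Shannon's entropy formula."""
    n = len(sample)
    return -sum((c / n) * log2(c / n) for c in Counter(sample).values())

# A balanced two-symbol sample has empirical entropy of exactly 1 bit.
print(plugin_entropy(["H", "T", "H", "T"]))  # → 1.0
```

On a finite alphabet this estimator is consistent but biased downward for small samples; the abstract's point is that on a countable alphabet its variance and expected error behave quite differently.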
Fifty Years of Shannon Theory, 1998
Cited by 38 (0 self)
Abstract: A brief chronicle is given of the historical development of the central problems in the theory of fundamental limits of data compression and reliable communication.
On the Role of Pattern Matching in Information Theory. IEEE Transactions on Information Theory
Cited by 19 (2 self)
Abstract: In this paper, the role of pattern matching in information theory is motivated and discussed. We describe the relationship between a pattern's recurrence time and its probability under the data-generating stochastic source. We motivate how this relationship has led to great advances in universal data compression. We then describe non-asymptotic uniform bounds on the performance of data compression algorithms in cases where the size of the training data available to the encoder is not large enough to yield the asymptotic compression: the Shannon entropy. We then discuss applications of pattern matching and universal compression to universal prediction, classification, and entropy estimation.
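The recurrence-time/probability relationship this abstract alludes to is Kac's lemma: for a stationary source, the expected time until a pattern recurs equals the reciprocal of its probability. A small simulation sketch in Python (purely illustrative, not the paper's method):

```python
import random

def first_recurrence_time(seq, k):
    """Steps until the initial length-k pattern reappears (overlaps allowed)."""
    pattern = seq[:k]
    for t in range(1, len(seq) - k + 1):
        if seq[t:t + k] == pattern:
            return t
    return None  # pattern did not recur within the sequence

random.seed(0)
k = 3  # each length-3 pattern of fair bits has probability 2**-3
times = []
for _ in range(2000):
    seq = [random.randint(0, 1) for _ in range(200)]
    t = first_recurrence_time(seq, k)
    if t is not None:
        times.append(t)

# Kac's lemma predicts a mean recurrence time of 1 / 2**-3 = 8, and
# (1/k) * log2(mean) recovers the source's 1 bit/symbol entropy rate.
print(sum(times) / len(times))  # close to 8
```

Universal compressors such as Lempel–Ziv exploit exactly this link: short recurrence times signal high-probability patterns that can be coded cheaply.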
Estimating Entropy Rates with Bayesian Confidence Intervals
Neural Computation 17, 1531–1576, 2005
Cited by 15 (1 self)
Abstract: The entropy rate quantifies the amount of uncertainty or disorder produced by any dynamical system. In a spiking neuron, this uncertainty translates into the amount of information potentially encoded, and it is thus the subject of intense theoretical and experimental investigation. Estimating this quantity in observed, experimental data is difficult and requires a judicious selection of probabilistic models, balancing between two opposing biases. We use a model weighting principle originally developed for lossless data compression, following the minimum description length principle. This weighting yields a direct estimator of the entropy rate which, compared to existing methods, exhibits significantly less bias and converges faster in simulation. With Monte Carlo techniques, we estimate a Bayesian confidence interval for the entropy rate. In related work, we apply these ideas to estimate the information rates between sensory stimuli and neural responses in experimental data (Shlens, Kennel, Abarbanel, & Chichilnisky, 2004).
Computing the Entropy of User Navigation in the Web
International Journal of Information Technology and Decision Making, 1999
Cited by 10 (1 self)
Abstract: Navigation through the web, colloquially known as "surfing", is one of the main activities of users during web interaction. When users follow a navigation trail they often tend to get disoriented in terms of the goals of their original query, so the discovery of typical user trails could be useful in providing navigation assistance. Herein we give a theoretical underpinning of user navigation in terms of the entropy of an underlying Markov chain modelling the web topology. We present a novel method for online incremental computation of the entropy, and a large deviation result regarding the length of a trail needed to realise this entropy. We provide an error analysis for our estimation of the entropy in terms of the divergence between the empirical and actual probabilities. We also extend our technique to higher-order Markov chains by a suitable reduction of a higher-order Markov chain model to a first-order one.
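The entropy of the underlying Markov chain referred to here is the standard entropy rate, H = −Σ_i π_i Σ_j p_ij log2 p_ij, with π the stationary distribution. A hedged sketch in Python using NumPy (the toy two-page transition matrix is illustrative, not from the paper):

```python
import numpy as np

def markov_entropy_rate(P):
    """Entropy rate (bits per step) of an ergodic Markov chain:
    H = -sum_i pi_i sum_j P[i, j] * log2(P[i, j]), where pi is the
    stationary distribution (left eigenvector of P for eigenvalue 1)."""
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmax(np.real(w))])
    pi = pi / pi.sum()
    safe = np.where(P > 0, P, 1.0)  # avoid log2(0); those terms contribute 0
    return float(-np.sum(pi[:, None] * P * np.log2(safe)))

# Two pages, each linking to itself or the other with equal probability:
P = np.array([[0.5, 0.5],
              [0.5, 0.5]])
print(markov_entropy_rate(P))  # → 1.0
```

An online incremental version, as the paper proposes, would update the empirical transition counts per observed click and refresh only the affected rows of this sum.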
Estimating Information Rates with Confidence Intervals in Neural Spike Trains, 2007
Cited by 9 (0 self)
Abstract: Information theory provides a natural set of statistics to quantify the amount of knowledge a neuron conveys about a stimulus. A related work (Kennel, Shlens, Abarbanel, & Chichilnisky, 2005) demonstrated how to reliably estimate, with a Bayesian confidence interval, the entropy rate from a discrete, observed time series. We extend this method to measure the rate of novel information that a neural spike train encodes about a stimulus—the average and specific mutual information rates. Our estimator makes few assumptions about the underlying neural dynamics, shows excellent performance in experimentally relevant regimes, and uniquely provides confidence intervals bounding the range of information rates compatible with the observed spike train. We validate this estimator with simulations of spike trains and highlight how stimulus parameters affect its convergence in bias and variance. Finally, we apply these ideas to a recording from a guinea pig retinal ganglion cell and compare results to a simple linear decoder.
Estimating the Entropy of Binary Time Series: Methodology, Some Theory and a Simulation Study
The Linux pseudorandom number generator revisited. IACR Cryptology ePrint Archive
Cited by 7 (0 self)
Abstract: The Linux pseudorandom number generator (PRNG) is a PRNG with entropy inputs which is widely used in many security-related applications and protocols. This PRNG is distributed as open-source code and is subject to regular changes. It was last analyzed in the work of Gutterman et al. in 2006 [GPR06], but since then no new analysis has been made available, while in the meantime several changes have been applied to the code, among others to counter the attacks presented in [GPR06]. Our work describes the Linux PRNG of kernel versions 2.6.30.7 and upwards. We detail the PRNG architecture in the Linux system and provide its first accurate mathematical description and a precise analysis of the building blocks, including entropy estimation and extraction. Subsequently, we give a security analysis including the feasibility of cryptographic attacks and an empirical test of the entropy estimator. Finally, we underline some important changes relative to previous versions and their consequences.
Universal erasure entropy estimation
In Proc. of the 2006 IEEE Intl. Symp. on Inform. Theory (ISIT'06), 2006
Cited by 6 (3 self)
Abstract: Erasure entropy rate (introduced recently by Verdú and Weissman) differs from Shannon's entropy rate in that the conditioning occurs with respect to both the past and the future, as opposed to only the past (or the future). In this paper, universal algorithms for estimating erasure entropy rate are proposed based on the basic and extended context-tree weighting (CTW) algorithms. Consistency results are shown for these CTW-based algorithms. Simulation results for these algorithms applied to Markov sources, tree sources, and English texts are compared to those obtained by fixed-order plug-in estimators with different orders. An estimate of the erasure entropy of English texts based on the proposed algorithms is about 0.22 bits per letter, which can be compared to an estimate of about 1.3 bits per letter for the entropy rate of English texts by a similar CTW-based algorithm.
Statistical techniques for text classification based on word recurrence intervals. Fluctuations and Noise Letters 3, 2003
Cited by 5 (3 self)
Abstract: We present a method for characterizing text based on a statistical analysis of word recurrence intervals. This method can be used for extracting keywords from text, and also for comparing texts by an unknown author against a set of known authors. We also use these methods to comment on the controversial question of who wrote the Letter to the Hebrews in the New Testament.