Computing the Entropy of User Navigation in the Web
- International Journal of Information Technology and Decision Making, 1999
Cited by 8 (1 self)
Navigation through the web, colloquially known as "surfing", is one of the main activities of users during web interaction. When users follow a navigation trail they often tend to get disoriented in terms of the goals of their original query and thus the discovery of typical user trails could be useful in providing navigation assistance. Herein we give a theoretical underpinning of user navigation in terms of the entropy of an underlying Markov chain modelling the web topology. We present a novel method for online incremental computation of the entropy and a large deviation result regarding the length of a trail to realise the said entropy. We provide an error analysis for our estimation of the entropy in terms of the divergence between the empirical and actual probabilities. We also provide an extension of our technique to higher-order Markov chains by a suitable reduction of a higher-order Markov chain model to a first-order one.
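As a rough illustration of the quantity this abstract refers to, the sketch below computes the textbook entropy rate of a first-order Markov chain, H = -sum_i pi_i sum_j P_ij log2 P_ij, where pi is the stationary distribution. It is not the paper's online incremental method; the three-page transition matrix and the function names are hypothetical, and the power iteration assumes the chain is irreducible and aperiodic.

```python
import math


def stationary_distribution(P, iterations=1000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration (assumes an irreducible, aperiodic chain)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def entropy_rate(P):
    """Entropy rate in bits per step: -sum_i pi_i sum_j P_ij log2 P_ij."""
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log2(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0.0  # zero-probability links contribute nothing
    )


# Hypothetical three-page site: row i holds the probabilities of
# following a link from page i to each page.
P = [
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
]
print(entropy_rate(P))
```

A page with a single outgoing link (row 1 here) adds no uncertainty, so the entropy rate is driven by the pages where the surfer has a real choice.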
On The Role of Pattern Matching In Information Theory
- IEEE Transactions on Information Theory
Cited by 6 (2 self)
In this paper, the role of pattern matching in information theory is motivated and discussed. We describe the relationship between a pattern's recurrence time and its probability under the data-generating stochastic source. We motivate how this relationship has led to great advances in universal data compression. We then describe non-asymptotic uniform bounds on the performance of data compression algorithms in cases where the size of the training data that is available to the encoder is not large enough to yield the asymptotic compression: the Shannon entropy. We then discuss applications of pattern matching and universal compression to universal prediction, classification, and entropy estimation. This research is partially supported by the Bi-national US-Israel Science fund. The work of A.J. Wyner was supported by NSF grant DMS-9508933. I. Introduction: The self-information of a random event or a random message is a term coined by C. E. Shannon, who defined it to be "minus th...
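The recurrence-time relationship mentioned in this abstract can be sketched as follows, assuming it refers to the standard observation that an n-symbol pattern of probability p tends to recur after roughly 1/p symbols, so that log2(recurrence time) / n serves as a crude entropy estimate. The function names are hypothetical and the estimator is a minimal single-pattern version, not the paper's algorithms.

```python
import math


def recurrence_time(seq, n):
    """Smallest k >= 1 with seq[k:k+n] == seq[:n], or None if the
    opening n-symbol pattern never recurs within seq."""
    pattern = seq[:n]
    for k in range(1, len(seq) - n + 1):
        if seq[k:k + n] == pattern:
            return k
    return None


def recurrence_entropy_estimate(seq, n):
    """Crude entropy estimate in bits per symbol: log2(recurrence time) / n.
    Returns None when the pattern does not recur."""
    r = recurrence_time(seq, n)
    if r is None:
        return None
    return math.log2(r) / n


print(recurrence_time("abcabcabc", 2))
print(recurrence_entropy_estimate("abcabcabc", 2))
```

A single pattern gives a noisy estimate; in practice one would average over many starting positions and use long sequences.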
Estimation of the rate-distortion function
- 2007. [Online]. Available: http://arxiv.org/abs/cs/0702018v1
Cited by 3 (0 self)
Motivated by questions in lossy data compression and by theoretical considerations, this paper examines the problem of estimating the rate-distortion function of an unknown (not necessarily discrete-valued) source from empirical data. The main focus is the behavior of the so-called “plug-in” estimator, which is simply the rate-distortion function of the empirical distribution of the observed data. Sufficient conditions are given for its consistency, and examples are provided to demonstrate that in certain cases it fails to converge to the true rate-distortion function. The analysis of the performance of the plug-in estimator is somewhat surprisingly intricate, even for stationary memoryless sources; the underlying mathematical problem is closely related to the classical problem of establishing the consistency of maximum likelihood estimators. General consistency results are given for the plug-in estimator applied to a broad class of sources, including all stationary and ergodic ones. A more general class of estimation problems is also considered, arising in the context of lossy data compression when the allowed class of coding distributions is restricted; analogous results are developed for the plug-in estimator in that case. Finally, consistency theorems are formulated for modified (e.g., penalized) versions of the plug-in estimator, and for estimating the optimal reproduction distribution.
Universal erasure entropy estimation
- In Proc. of the 2006 IEEE Intl. Symp. on Inform. Theory (ISIT’06), 2006
Cited by 3 (3 self)
Erasure entropy rate (introduced recently by Verdú and Weissman) differs from Shannon’s entropy rate in that the conditioning occurs with respect to both the past and the future, as opposed to only the past (or the future). In this paper, universal algorithms for estimating erasure entropy rate are proposed based on the basic and extended context-tree weighting (CTW) algorithms. Consistency results are shown for those CTW-based algorithms. Simulation results for those algorithms applied to Markov sources, tree sources and English texts are compared to those obtained by fixed-order plug-in estimators with different orders. An estimate of the erasure entropy of English texts based on the proposed algorithms is about 0.22 bits per letter, which can be compared to an estimate of about 1.3 bits per letter for the entropy rate of English texts by a similar CTW-based algorithm.
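To make the two-sided conditioning concrete, here is a minimal sketch of a fixed-order plug-in estimate of the kind the abstract uses as a baseline: the empirical conditional entropy of a symbol given its k left and k right neighbours. This is not the CTW-based algorithm; the function name and the choice of a symmetric window of k symbols on each side are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def plugin_erasure_entropy(seq, k):
    """Empirical entropy (bits) of a symbol given the k symbols before it
    and the k symbols after it."""
    counts = defaultdict(Counter)
    for i in range(k, len(seq) - k):
        context = (tuple(seq[i - k:i]), tuple(seq[i + 1:i + k + 1]))
        counts[context][seq[i]] += 1
    total = sum(sum(c.values()) for c in counts.values())
    if total == 0:
        raise ValueError("sequence too short for this context length")
    h = 0.0
    for c in counts.values():
        n_ctx = sum(c.values())
        for n_sym in c.values():
            # weight: joint frequency; log term: conditional frequency
            h -= (n_sym / total) * math.log2(n_sym / n_ctx)
    return h


print(plugin_erasure_entropy("0101010101", 1))
print(plugin_erasure_entropy("0011", 1))
```

In the alternating sequence every symbol is determined by its neighbours, so the estimate is zero; in "0011" the context (0, 1) is seen once around each symbol, giving one bit.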
Statistical techniques for text classification based on word recurrence intervals
- Fluctuations and Noise Letters 3, 2003
Cited by 2 (1 self)
We present a method for characterizing text based on a statistical analysis of word recurrence intervals. This method can be used for extracting keywords from text, and also for comparing texts by an unknown author against a set of known authors. We also use these methods to comment on the controversial question of who wrote the letter to the Hebrews in the New Testament.
From the entropy to the statistical structure of spike trains
- IEEE Int. Symp. on Inform. Theory, 2006
Cited by 2 (1 self)
We use statistical estimates of the entropy rate of spike train data in order to make inferences about the underlying structure of the spike train itself. We first examine a number of different parametric and nonparametric estimators (some known and some new), including the “plug-in” method, several versions of Lempel-Ziv-based compression algorithms, a maximum likelihood estimator tailored to renewal processes, and the natural estimator derived from the Context-Tree Weighting method (CTW). The theoretical properties of these estimators are examined, several new theoretical results are developed, and all estimators are systematically applied to various types of synthetic data and under different conditions. Our main focus is on the performance of these entropy estimators on the (binary) spike trains of 28 neurons recorded simultaneously for a one-hour period from the primary motor and dorsal premotor cortices of a monkey. We show how the entropy estimates can be used to test for the existence of long-term structure in the data, and we construct a hypothesis test for whether the renewal process model is appropriate for these spike trains. Further, by applying the CTW algorithm we derive the maximum a posteriori (MAP) tree model of our empirical data, and comment on the underlying structure it reveals.
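The “plug-in” method named in this abstract can be sketched in its simplest form: take the empirical distribution of overlapping words of a fixed length, compute its Shannon entropy, and divide by the word length. The function name, the use of overlapping words, and the toy input are assumptions for illustration; the paper's exact variant may differ.

```python
import math
from collections import Counter


def plugin_entropy_rate(bits, word_length):
    """Entropy (bits per symbol) of the empirical distribution of
    overlapping words of the given length, divided by the word length."""
    words = [
        tuple(bits[i:i + word_length])
        for i in range(len(bits) - word_length + 1)
    ]
    if not words:
        raise ValueError("sequence shorter than word_length")
    counts = Counter(words)
    total = len(words)
    block_entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts.values()
    )
    return block_entropy / word_length


# Toy binary "spike train": 1 marks a spike in a time bin.
spikes = [0, 1] * 8
print(plugin_entropy_rate(spikes, 1))
print(plugin_entropy_rate(spikes, 4))
```

With word length 1 the alternating train looks like a fair coin; longer words expose its deterministic structure and pull the estimate down, which is why the choice of word length relative to the amount of data matters for this estimator.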
Estimating the Entropy of Binary Time Series: Methodology, Some Theory and a Simulation Study
Order Estimation of Markov Chains
, 711
We describe estimators $\chi_n(X_0, X_1, \ldots, X_n)$ which, when applied to an unknown stationary process taking values from a countable alphabet X, converge almost surely to $k$ if the process is a $k$-th order Markov chain and to infinity otherwise.

