A Universal Predictor Based on Pattern Matching
 IEEE Trans. Inform. Theory
, 2000
Cited by 26 (1 self)
We consider here a universal predictor based on pattern matching. For a given string $x_1, x_2, \ldots, x_n$, the predictor will guess the next symbol $x_{n+1}$ in such a way that the prediction error tends to zero as $n \to \infty$, provided the string $x_1^n = x_1, x_2, \ldots, x_n$ is generated by a mixing source. We shall prove that the rate of convergence of the prediction error is $O(n^{-\varepsilon})$ for any $\varepsilon > 0$. In this preliminary version, we only prove our results for memoryless sources and give a sketch for mixing sources. However, we indicate that our algorithm can predict equally successfully the next $k$ symbols as long as $k = O(1)$.
1 Introduction
Prediction is important in communication, control, forecasting, investment and other areas. We understand how to do optimal prediction when the data model is known, but one needs to design a universal prediction algorithm that will perform well no matter what the underlying probabilistic model is. More precisely, let $X_1, X_2, \ldots$ be an infinite ...
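The idea of predicting from matched patterns can be illustrated with a small sketch. This is not the algorithm analysed in the paper: it is a simplified stand-in that finds the longest suffix of the sequence that has occurred before and takes a majority vote over the symbols that followed those earlier occurrences. The function name `predict_next` and the `max_context` cap are assumptions made for illustration.

```python
from collections import Counter


def predict_next(seq, max_context=8):
    """Guess the next symbol of seq (a string) from earlier repeats.

    Hypothetical helper: tries the longest suffix first (capped at
    max_context symbols) and votes over what followed its earlier
    occurrences. Returns None for an empty sequence.
    """
    n = len(seq)
    if n == 0:
        return None
    for k in range(min(max_context, n - 1), 0, -1):
        suffix = seq[n - k:]
        # Collect the symbol that followed each earlier occurrence
        # of the suffix (the suffix's own position is excluded).
        followers = Counter(
            seq[i + k] for i in range(n - k) if seq[i:i + k] == suffix
        )
        if followers:
            return followers.most_common(1)[0][0]
    # No suffix repeats: fall back to the most frequent symbol so far.
    return Counter(seq).most_common(1)[0][0]
```

For example, `predict_next("abababab")` finds the suffix "ababab" at the start of the string, sees that it was followed by "a", and returns "a". The quadratic scan is kept for readability; a suffix tree or suffix array would be the usual choice for long inputs.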
Semantically Motivated Improvements for PPM Variants
 The Computer Journal
, 1997
Cited by 25 (3 self)
This paper explains how to significantly improve the compression performance of any PPM variant
Markov Types and Minimax Redundancy for Markov Sources
 IEEE Trans. Information Theory
, 2003
Cited by 18 (10 self)
Redundancy of universal codes for a class of sources determines by how much the actual code length exceeds the optimal code length. In the minimax scenario one designs the best code for the worst source within the class. Such minimax redundancy comes in two flavors: either on average or for individual sequences. The latter is also known as the maximal or the worst case minimax redundancy. We study the maximal minimax redundancy of universal block codes for Markovian sources of any order. We prove that the maximal minimax redundancy for Markov sources of order $r$ is asymptotically equal to $\frac{1}{2} m^r (m-1) \log_2 n + \log_2 A_m^r - (\ln \ln m^{1/(m-1)}) / \ln m + o(1)$, where $n$ is the length of a source sequence, $m$ is the size of the alphabet and $A_m^r$ is an explicit constant (e.g., we find that for a binary alphabet $m = 2$ and Markov of order $r = 1$ the constant $A_2^1 = 16 \cdot G \approx 14.655449504$, where $G$ is the Catalan constant). Unlike previous attempts, we view the redundancy problem as an asymptotic evaluation of certain sums over a set of matrices representing Markov types. The enumeration of Markov types is accomplished by reducing it to counting Eulerian paths in a multigraph. In particular, we propose an asymptotic formula for the number of strings of a given Markov type. All of these findings are obtained by analytic and combinatorial tools of analysis of algorithms.
Index terms: Minimax redundancy, Markov sources, Markov types, Eulerian paths, multidimensional generating functions, analytic information theory.
A preliminary version of this paper was presented at Colloquium on Mathematics and Computer Science: Algorithms, Trees, Combinatorics and Probabilities, Versailles, 2002.
Hierarchical Universal Coding
 IEEE Trans. Inform. Theory
, 1998
Cited by 16 (3 self)
In an earlier paper, we proved a strong version of the redundancy-capacity converse theorem of universal coding, stating that for 'most' sources in a given class, the universal coding redundancy is essentially lower bounded by the capacity of the channel induced by this class. Since this result holds for general classes of sources, it extends Rissanen's strong converse theorem for parametric families. While our earlier result has established strong optimality only for mixture codes weighted by the capacity-achieving prior, our first result herein extends this finding to a general prior. For some cases our technique also leads to a simplified proof of the above-mentioned strong converse theorem. The major interest in this paper, however, is in extending the theory of universal coding to hierarchical structures of classes, where each class may have a different capacity. In this setting, one wishes to incur redundancy essentially as small as that corresponding to the active class, and not ...
On-Line Stochastic Processes in Data Compression
, 1996
Cited by 15 (6 self)
The ability to predict the future based upon the past in finite-alphabet sequences has many applications, including communications, data security, pattern recognition, and natural language processing. By Shannon's theory and the breakthrough development of arithmetic coding, any sequence, $a_1 a_2 \cdots a_n$, can be encoded in a number of bits that is essentially equal to the minimal information-lossless codelength, $\sum_i -\log_2 p(a_i \mid a_1 \cdots a_{i-1})$. The goal of universal on-line modeling, and therefore of universal data compression, is to deduce the model of the input sequence $a_1 a_2 \cdots a_n$ that can estimate each $p(a_i \mid a_1 \cdots a_{i-1})$ knowing only $a_1 a_2 \cdots a_{i-1}$ so that the ex...
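The codelength expression in the abstract above can be made concrete with a short sketch. It sums the negative base-2 logarithm of each predicted symbol probability, where each prediction uses only the symbols already seen. The probability model here is an assumption chosen for simplicity (an order-0 add-one estimator), not the modeling technique of the thesis, and the name `ideal_code_length` is hypothetical.

```python
import math
from collections import Counter


def ideal_code_length(seq, alphabet):
    """Return the ideal adaptive codelength of seq in bits.

    Assumed model: order-0 add-one (Laplace) estimator, so the
    probability of symbol a at step i depends only on the counts
    of the i symbols seen before it.
    """
    counts = Counter()
    m = len(alphabet)
    total_bits = 0.0
    for i, a in enumerate(seq):
        p = (counts[a] + 1) / (i + m)   # estimate from the past only
        total_bits += -math.log2(p)     # cost of coding this symbol
        counts[a] += 1                  # update the model afterwards
    return total_bits
```

For the sequence "aaaa" over the alphabet "ab" the predicted probabilities are 1/2, 2/3, 3/4 and 4/5, whose product is 1/5, so the total is log2(5) bits. An arithmetic coder driven by the same probabilities would come within a few bits of this figure.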
New Techniques for Context Modeling
, 1995
Cited by 15 (2 self)
We introduce three new techniques for statistical language models: extension modeling, nonmonotonic contexts, and the divergence heuristic. Together these techniques result in language models that have few states, even fewer parameters, and low message entropies.
Model Selection for Variable Length Markov Chains and Tuning the Context Algorithm
, 2000
Cited by 13 (3 self)
We consider the model selection problem in the class of stationary variable length Markov chains (VLMC) on a finite space. The processes in this class are still Markovian of higher order, but with memory of variable length. Various aims in selecting a VLMC can be formalized with different non-equivalent risks, such as final prediction error or expected Kullback-Leibler information. We consider the asymptotic behavior of different risk functions and show how they can be generally estimated with the same resampling strategy. Such estimated risks then yield new model selection criteria. In particular, we obtain a data-driven tuning of Rissanen's tree structured context algorithm which is a computationally feasible procedure for selection and estimation of a VLMC.
Key words and phrases. Bootstrap, zero-one loss, final prediction error, finite-memory source, FSMX model, Kullback-Leibler information, $L_2$ loss, optimal tree pruning, resampling, tree model.
Short title: Selecting variable length Mar...
Modeling the
 DOCSIS 1.1/2.0 MAC Protocol”, ICCCN03
, 2003
Cited by 11 (2 self)
Keywords: universal data compression; enumerative coding; tree models; Markov sources; method of types.
Efficient enumerative coding for tree sources is, in general, surprisingly intricate: a simple uniform encoding of type classes, which is asymptotically optimal in expectation for many classical models such as FSMs, turns out not to be so in this case. We describe an efficiently computable enumerative code that is universal in the family of tree models in the sense that, for a string emitted by an unknown source whose model is supported on a known tree, the expected normalized code length of the encoding approaches the entropy rate of the source with a convergence rate $(K/2)(\log n)/n$, where $K$ is the number of free parameters of the model family. Based on recent results characterizing type classes of context trees, the code consists of the index of the sequence in the tree type class, and an efficient description of the class itself using a nonuniform encoding of selected string counts. The results are extended to a twice-universal setting, where the tree underlying the source model is unknown.
Computing the Entropy of User Navigation in the Web
 International Journal of Information Technology and Decision Making
, 1999
Cited by 10 (1 self)
Navigation through the web, colloquially known as "surfing", is one of the main activities of users during web interaction. When users follow a navigation trail they often tend to get disoriented in terms of the goals of their original query and thus the discovery of typical user trails could be useful in providing navigation assistance. Herein we give a theoretical underpinning of user navigation in terms of the entropy of an underlying Markov chain modelling the web topology. We present a novel method for online incremental computation of the entropy and a large deviation result regarding the length of a trail to realise the said entropy. We provide an error analysis for our estimation of the entropy in terms of the divergence between the empirical and actual probabilities. We also provide an extension of our technique to higher-order Markov chains by a suitable reduction of a higher-order Markov chain model to a first-order one.
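The entropy of a Markov chain mentioned in the abstract above can be illustrated with the textbook entropy-rate formula, $H = -\sum_i \pi_i \sum_j P_{ij} \log_2 P_{ij}$, where $\pi$ is the stationary distribution of the transition matrix $P$. The sketch below is a batch computation, not the online incremental method the paper proposes. It assumes an ergodic chain so that power iteration converges, and the function names are hypothetical.

```python
import math


def stationary_distribution(P, iterations=1000):
    """Approximate the stationary distribution of a row-stochastic
    matrix P (list of lists) by power iteration from the uniform
    distribution. Assumes the chain is ergodic."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def entropy_rate(P):
    """Entropy rate in bits per step: the stationary-weighted average
    of the entropy of each row of P. Zero entries contribute nothing."""
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log2(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0
    )
```

With two pages that link to each other and to themselves with equal probability, `entropy_rate([[0.5, 0.5], [0.5, 0.5]])` gives 1 bit per step. In a web setting each state would be a page and each row the empirical link-following probabilities from that page.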
Precise Average Redundancy of an Idealized Arithmetic Coding
 Proc. Data Compression Conference, 222-231, Snowbird
, 2002