Results 11–20 of 47
Strong consistency of the Good-Turing estimator
 in Proc. IEEE Int. Symp. Inf. Theory
, 2006
Concentration Bounds for Unigrams Language Model
, 2004
Abstract

Cited by 4 (0 self)
We show several PAC-style concentration bounds for learning unigram language models. One interesting quantity is the probability of all words appearing exactly k times in a sample of size m. A standard estimator for this quantity is the Good-Turing estimator. The existing analysis of its error shows a PAC bound of approximately O(…). We improve its dependency on k to O(⁴√k …). We also analyze the empirical frequencies estimator, showing that its PAC error bound is approximately …. We derive a combined estimator, which has …, for any k. A standard measure...
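The quantity this abstract studies — the total probability mass of all words appearing exactly k times in a sample of size m — has the classical Good-Turing estimate G_k = (k + 1) · n_{k+1} / m, where n_j is the number of distinct words occurring j times. A minimal illustrative sketch (not the paper's improved estimator; the function name is ours):

```python
from collections import Counter

def good_turing_mass(sample, k):
    """Good-Turing estimate of the total probability mass of words
    appearing exactly k times: G_k = (k + 1) * n_{k+1} / m,
    where n_j = number of distinct words seen j times, m = |sample|."""
    counts = Counter(sample)                  # word -> frequency
    freq_of_freq = Counter(counts.values())   # j -> n_j
    m = len(sample)
    return (k + 1) * freq_of_freq.get(k + 1, 0) / m

words = "a b a c a b d".split()
# counts: a->3, b->2, c->1, d->1, so n_1=2, n_2=1, n_3=1, m=7
print(good_turing_mass(words, 0))  # unseen mass estimate: 1*n_1/m = 2/7
print(good_turing_mass(words, 1))  # singleton mass estimate: 2*n_2/m = 2/7
```

With k = 0 this is the familiar missing-mass estimate n_1 / m, the case the empirical-frequency estimator handles worst.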
Superior Guarantees for Sequential Prediction and Lossless Compression via Alphabet Decomposition
Abstract

Cited by 3 (0 self)
We present worst-case bounds for the learning rate of a known prediction method that is based on hierarchical applications of binary context tree weighting (CTW) predictors. A heuristic application of this approach that relies on Huffman’s alphabet decomposition is known to achieve state-of-the-art performance in prediction and lossless compression benchmarks. We show that our new bound for this heuristic is tighter than the best known performance guarantees for prediction and lossless compression algorithms in various settings. This result substantiates the efficiency of this hierarchical method and provides a compelling explanation for its practical success. In addition, we present the results of a few experiments that examine other possibilities for improving the multi-alphabet prediction performance of CTW-based algorithms.
Universal compression of Markov and related sources over arbitrary alphabets
 in IEEE Transactions on Information Theory
, 2006
Abstract

Cited by 3 (2 self)
Recent work has considered encoding a string by separately conveying its symbols and its pattern—the order in which the symbols appear. It was shown that the patterns of i.i.d. strings can be losslessly compressed with diminishing per-symbol redundancy. In this paper the pattern redundancy of distributions with memory is considered. Close lower and upper bounds are established on the pattern redundancy of strings generated by Hidden Markov Models with a small number of states, showing in particular that their per-symbol pattern redundancy diminishes with increasing string length. The upper bounds are obtained by analyzing the growth rate of the number of multidimensional integer partitions, and the lower bounds, using Hayman’s Theorem.
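The "pattern" of a string, as defined in this abstract, replaces each symbol by the order of its first appearance, discarding symbol identities. A minimal sketch of that transformation (the function name is ours, for illustration only):

```python
def pattern(s):
    """Return the pattern of a string: each symbol is replaced by the
    rank of its first appearance, e.g. 'abracadabra' -> 1 2 3 1 4 1 5 1 2 3 1."""
    first_seen = {}   # symbol -> rank of first appearance
    out = []
    for sym in s:
        if sym not in first_seen:
            first_seen[sym] = len(first_seen) + 1
        out.append(first_seen[sym])
    return out

print(pattern("abracadabra"))  # [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]
```

Note that every string over any alphabet maps to a pattern over the integers, which is what makes the "symbols plus pattern" split useful for arbitrary (even infinite) alphabets.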
A better Good-Turing estimator for sequence probabilities
 in Proc. IEEE Int. Symp. Inf. Theory
A Universal Compression Perspective of Smoothing
Abstract

Cited by 1 (0 self)
We analyze smoothing algorithms from a universal-compression perspective. Instead of evaluating their performance on an empirical sample, we analyze their performance on the most inconvenient sample possible. Consequently the performance of the algorithm can be guaranteed even on unseen data. We show that universal compression bounds can explain the empirical performance of several smoothing methods. We also describe a new interpolated additive smoothing algorithm, and show that it has lower training complexity and better compression performance than existing smoothing techniques. Key words: Language modeling, universal compression, smoothing
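Plain additive (add-α) smoothing, the building block behind the interpolated variant this abstract mentions, estimates P(w) = (c(w) + α) / (m + α|V|). A minimal sketch of the basic (non-interpolated) form over a fixed vocabulary — the names and the example vocabulary are ours, not from the paper:

```python
from collections import Counter

def additive_smoothing(sample, vocab, alpha=1.0):
    """Add-alpha smoothed unigram estimates over a fixed vocabulary:
    P(w) = (c(w) + alpha) / (m + alpha * |vocab|)."""
    counts = Counter(sample)
    m = len(sample)
    denom = m + alpha * len(vocab)
    return {w: (counts.get(w, 0) + alpha) / denom for w in vocab}

probs = additive_smoothing("a b a".split(), vocab={"a", "b", "c"})
# a: (2+1)/6 = 0.5, b: (1+1)/6, c: (0+1)/6; the estimates sum to 1
print(probs)
```

Because every vocabulary word gets at least α pseudo-counts, unseen words receive nonzero probability, which is what makes the worst-case (universal-compression) analysis of such estimators possible.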
Hayman Admissible Functions in Several Variables
Abstract

Cited by 1 (1 self)
An alternative generalisation of Hayman’s admissible functions ([17]) to functions in several variables is developed, and a multivariate asymptotic expansion for the coefficients is proved. In contrast to existing generalisations of Hayman admissibility ([7]), most of the closure properties which are satisfied by Hayman’s admissible functions can be shown to hold for this class of functions as well.