Maximum Entropy Good-Turing Estimator for Language Modeling
Abstract
In this paper, we propose a new formulation of the classical Good-Turing estimator for n-gram language models. The new approach is based on defining a dynamic model for language production. Instead of assuming a fixed probability distribution for the occurrence of an n-gram over the whole text, we propose a maximum entropy approximation of a time-varying distribution. This approximation leads to a new distribution, which in turn is used to calculate the expectations in the Good-Turing estimator. This defines a new estimator that we call the Maximum Entropy Good-Turing estimator. Contrary to the classical Good-Turing estimator, it needs neither approximations of expectations nor windowing or other smoothing techniques. It also contains the well-known discounting estimators as special cases. Performance is evaluated in terms of both perplexity and word error rate in an N-best re-scoring task. A comparison with other classical estimators is also performed. In all cases our approach performs significantly better than classical estimators.
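For context, the classical Good-Turing estimator that this abstract contrasts against replaces a raw count r with the adjusted count r* = (r + 1) N_{r+1} / N_r, where N_r is the number of distinct n-grams seen exactly r times. The sketch below illustrates only that classical form, not the Maximum Entropy variant proposed in the paper; the function names and toy data are hypothetical. It also shows why the classical estimator needs smoothing: the adjusted count is undefined wherever N_{r+1} is zero.

```python
from collections import Counter


def good_turing_adjusted_counts(counts):
    """Map each observed count r to r* = (r + 1) * N_{r+1} / N_r.

    Returns None for any r with N_{r+1} == 0; classical Good-Turing
    fills these gaps by smoothing the N_r values (not done here).
    """
    freq_of_freq = Counter(counts.values())  # N_r: how many items occur r times
    adjusted = {}
    for r, n_r in freq_of_freq.items():
        n_next = freq_of_freq.get(r + 1, 0)
        adjusted[r] = (r + 1) * n_next / n_r if n_next else None
    return adjusted


def unseen_mass(counts):
    """Probability mass reserved for unseen items: N_1 / N."""
    freq_of_freq = Counter(counts.values())
    total = sum(counts.values())
    return freq_of_freq.get(1, 0) / total


# Toy unigram data (hypothetical): a occurs 3 times, b twice, c, d, e once.
counts = Counter("a b a c a b d e".split())
adjusted = good_turing_adjusted_counts(counts)
mass = unseen_mass(counts)
```

On this toy data the largest count has no adjusted value because nothing occurs one more time than it does, which is the gap the abstract says the new estimator avoids.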
A New Estimator Based on Maximum Entropy
Abstract
In this paper, we propose a new formulation of the classical Good-Turing estimator for n-gram language models. The new approach is based on defining a dynamic model for language production. Instead of assuming a fixed probability distribution for the occurrence of an n-gram over the whole text, we propose a maximum entropy approximation of a time-varying distribution. This approximation leads to a new distribution, which in turn is used to calculate the expectations in the Good-Turing estimator. This defines a new estimator that we call the Maximum Entropy Good-Turing estimator. Contrary to the classical Good-Turing estimator, it needs neither approximations of expectations nor windowing or other smoothing techniques. It also contains the well-known discounting estimators as special cases. Performance is evaluated in terms of both perplexity and word error rate in an N-best re-scoring task. A comparison with other classical estimators is also performed. In all cases our approach performs significantly better than classical estimators.
Dyna: Extending Datalog For Modern AI
Abstract
Modern statistical AI systems are quite large and complex; this interferes with research, development, and education. We point out that most of the computation involves database-like queries and updates on complex views of the data. Specifically, recursive queries look up and aggregate relevant or potentially relevant values. If the results of these queries are memoized for reuse, the memos may need to be updated through change propagation. We propose a declarative language, which generalizes Datalog, to support this work in a generic way. Through examples, we show that a broad spectrum of AI algorithms can be concisely captured by writing down systems of equations in our notation. Many strategies could be used to actually solve those systems. Our examples motivate certain extensions to Datalog, which are connected to functional and object-oriented programming paradigms. 1 Why a New Data-Oriented Language for AI? Modern AI systems are frustratingly big, making them time-consuming to engineer
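To make the idea of memoized recursive queries with change propagation concrete, here is a minimal sketch written in Python rather than in the paper's own notation. It treats shortest paths as a system of equations, dist(v) = min over edges (u, v) of dist(u) + w, keeps the current values in a memo table, and re-examines a node's successors whenever its value improves. The function name, the agenda strategy, and the example graph are all assumptions made for illustration; the abstract notes that many solution strategies are possible.

```python
import math
from collections import defaultdict, deque


def solve_shortest_paths(edges, start):
    """Solve dist(v) = min over edges (u, v, w) of dist(u) + w by propagation.

    Assumes non-negative edge weights so that propagation terminates.
    """
    out_edges = defaultdict(list)
    for u, v, w in edges:
        out_edges[u].append((v, w))

    dist = defaultdict(lambda: math.inf)  # memo table of current values
    dist[start] = 0.0
    agenda = deque([start])  # nodes whose value changed and must be propagated

    while agenda:
        u = agenda.popleft()
        for v, w in out_edges[u]:
            candidate = dist[u] + w
            if candidate < dist[v]:  # the aggregate (min) improved: update memo
                dist[v] = candidate
                agenda.append(v)
    return dict(dist)


# Hypothetical three-node graph: the direct edge a -> c is longer than a -> b -> c.
edges = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 5.0)]
result = solve_shortest_paths(edges, "a")
```

The memo for c is first set from the direct edge and then revised once the shorter route through b is found, which is the kind of update the abstract calls change propagation.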
Lecture 11: The Good-Turing Estimate
2010
Abstract
In many language-related tasks, it would be extremely useful to know the probability that a sentence or word sequence will occur in a document. However, there is not enough data to account for all word sequences. Thus, n-gram models are used to approximate the probability of word sequences. Making an
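The data-sparsity problem described here can be seen in a minimal sketch of a bigram model with unsmoothed maximum-likelihood estimates. The function names and the two-sentence corpus are hypothetical and are not taken from the lecture. Any sentence containing an unseen bigram receives probability zero, which is the situation estimators such as Good-Turing are meant to address.

```python
from collections import Counter


def train_bigram(sentences):
    """Count histories and bigrams, padding each sentence with <s> and </s>."""
    histories, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        histories.update(tokens[:-1])  # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))
    return histories, bigrams


def sentence_probability(sentence, histories, bigrams):
    """Product of unsmoothed estimates P(word | prev) = c(prev, word) / c(prev)."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if histories[prev] == 0:
            return 0.0  # unseen history: no estimate available
        prob *= bigrams[(prev, word)] / histories[prev]
    return prob


# Hypothetical toy corpus.
histories, bigrams = train_bigram(["the cat sat", "the dog sat"])
p_seen = sentence_probability("the cat sat", histories, bigrams)
p_unseen = sentence_probability("the cat ran", histories, bigrams)
```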

