Results 1  10
of
770,502
Estimating Renyi Entropy of Discrete Distributions
"... It was recently shown that estimating the Shannon entropy H(p) of a discrete ksymbol distribution p requires Θ(k / log k) samples, a number that grows nearlinearly in the support size. In many applications H(p) can be replaced by the more general Rényi entropy of order α, Hα(p). We determine the ..."
Abstract
 Add to MetaCart
It was recently shown that estimating the Shannon entropy H(p) of a discrete ksymbol distribution p requires Θ(k / log k) samples, a number that grows nearlinearly in the support size. In many applications H(p) can be replaced by the more general Rényi entropy of order α, Hα(p). We determine
1Estimating Renyi Entropy of Discrete Distributions
"... It was recently shown that estimating the Shannon entropy H(p) of a discrete ksymbol distribution p requires Θ(k / log k) samples, a number that grows nearlinearly in the support size. In many applications H(p) can be replaced by the more general Rényi entropy of order α, Hα(p). We determine the ..."
Abstract
 Add to MetaCart
It was recently shown that estimating the Shannon entropy H(p) of a discrete ksymbol distribution p requires Θ(k / log k) samples, a number that grows nearlinearly in the support size. In many applications H(p) can be replaced by the more general Rényi entropy of order α, Hα(p). We determine
Maximum entropy markov models for information extraction and segmentation
, 2000
"... Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many textrelated tasks, such as partofspeech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial ..."
Abstract

Cited by 554 (18 self)
 Add to MetaCart
as multinomial distributions over a discrete vocabulary, and the HMM parameters are set to maximize the likelihood of the observations. This paper presents a new Markovian sequence model, closely related to HMMs, that allows observations to be represented as arbitrary overlapping features (such as word
Approximating discrete probability distributions with dependence trees
 IEEE TRANSACTIONS ON INFORMATION THEORY
, 1968
"... A method is presented to approximate optimally an ndimensional discrete probability distribution by a product of secondorder distributions, or the distribution of the firstorder tree dependence. The problem is to find an optimum set of n1 first order dependence relationship among the n variables ..."
Abstract

Cited by 874 (0 self)
 Add to MetaCart
A method is presented to approximate optimally an ndimensional discrete probability distribution by a product of secondorder distributions, or the distribution of the firstorder tree dependence. The problem is to find an optimum set of n1 first order dependence relationship among the n
Estimating the Support of a HighDimensional Distribution
, 1999
"... Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S is bounded by some a priori specified between 0 and 1. We propo ..."
Abstract

Cited by 766 (29 self)
 Add to MetaCart
Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S is bounded by some a priori specified between 0 and 1. We
A MaximumEntropyInspired Parser
, 1999
"... We present a new parser for parsing down to Penn treebank style parse trees that achieves 90.1% average precision/recall for sentences of length 40 and less, and 89.5% for sentences of length 100 and less when trained and tested on the previously established [5,9,10,15,17] "stan dard" se ..."
Abstract

Cited by 963 (19 self)
 Add to MetaCart
" sections of the Wall Street Journal tree bank. This represents a 13% decrease in error rate over the best singleparser results on this corpus [9]. The major technical innova tion is the use of a "maximumentropyinspired" model for conditioning and smoothing that let us successfully to test
A Maximum Entropy approach to Natural Language Processing
 COMPUTATIONAL LINGUISTICS
, 1996
"... The concept of maximum entropy can be traced back along multiple threads to Biblical times. Only recently, however, have computers become powerful enough to permit the widescale application of this concept to real world problems in statistical estimation and pattern recognition. In this paper we des ..."
Abstract

Cited by 1341 (5 self)
 Add to MetaCart
The concept of maximum entropy can be traced back along multiple threads to Biblical times. Only recently, however, have computers become powerful enough to permit the widescale application of this concept to real world problems in statistical estimation and pattern recognition. In this paper we
Estimating Continuous Distributions in Bayesian Classifiers
 In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence
, 1995
"... When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality ..."
Abstract

Cited by 489 (2 self)
 Add to MetaCart
When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon
A Maximum Entropy Model for PartOfSpeech Tagging
, 1996
"... This paper presents a statistical model which trains from a corpus annotated with PartOfSpeech tags and assigns them to previously unseen text with stateoftheart accuracy(96.6%). The model can be classified as a Maximum Entropy model and simultaneously uses many contextual "features" t ..."
Abstract

Cited by 577 (1 self)
 Add to MetaCart
This paper presents a statistical model which trains from a corpus annotated with PartOfSpeech tags and assigns them to previously unseen text with stateoftheart accuracy(96.6%). The model can be classified as a Maximum Entropy model and simultaneously uses many contextual "
Discriminative Training and Maximum Entropy Models for Statistical Machine Translation
, 2002
"... We present a framework for statistical machine translation of natural languages based on direct maximum entropy models, which contains the widely used source channel approach as a special case. All knowledge sources are treated as feature functions, which depend on the source language senten ..."
Abstract

Cited by 497 (30 self)
 Add to MetaCart
We present a framework for statistical machine translation of natural languages based on direct maximum entropy models, which contains the widely used source channel approach as a special case. All knowledge sources are treated as feature functions, which depend on the source language
Results 1  10
of
770,502