Results 1 -
4 of
4
Minimum Error Rate Training in Statistical Machine Translation
, 2003
"... Often, the training procedure for statistical machine translation models is based on maximum likelihood or related criteria. A general problem of this approach is that there is only a loose relation to the final translation quality on unseen text. In this paper, we analyze various training cri ..."
Abstract
-
Cited by 282 (5 self)
- Add to MetaCart
Often, the training procedure for statistical machine translation models is based on maximum likelihood or related criteria. A general problem of this approach is that there is only a loose relation to the final translation quality on unseen text. In this paper, we analyze various training criteria which directly optimize translation quality.
Whole-Sentence Exponential Language Models: A Vehicle for Linguistic-Statistical Integration
- Computers, Speech and Language
, 2001
"... We introduce an exponential language model which models a whole sentence or utterance as a single unit. By avoiding the chain rule, the model treats each sentence as a "bag of features", where features are arbitrary computable properties of the sentence. The new model is computationally more effici ..."
Abstract
-
Cited by 12 (0 self)
- Add to MetaCart
We introduce an exponential language model which models a whole sentence or utterance as a single unit. By avoiding the chain rule, the model treats each sentence as a "bag of features", where features are arbitrary computable properties of the sentence. The new model is computationally more efficient, and more naturally suited to modeling global sentential phenomena, than the conditional exponential (e.g. Maximum Entropy) models proposed to date. Using the model is straightforward. Training the model requires sampling from an exponential distribution. We describe the challenge of applying Monte Carlo Markov Chain (MCMC) and other sampling techniques to natural language, and discuss smoothing and step-size selection. We then present a novel procedure for feature selection, which exploits discrepancies between the existing model and the training corpus. We demonstrate our ideas by constructing and analyzing competitive models in the Switchboard domain, incorporating lexical and syntact...
Maximum Expected F-Measure Training of Logistic Regression Models
, 2005
"... We consider the problem of training logistic regression models for binary classification in information extraction and information retrieval tasks. Fitting probabilistic models for use with such tasks should take into account the demands of the taskspecific utility function, in this case the ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
We consider the problem of training logistic regression models for binary classification in information extraction and information retrieval tasks. Fitting probabilistic models for use with such tasks should take into account the demands of the taskspecific utility function, in this case the well-known F-measure, which combines recall and precision into a global measure of utility. We develop a training procedure based on empirical risk minimization / utility maximization and evaluate it on a simple extraction task.

