Results 1 - 10 of 3,782,336
Minimum Error Rate Training in Statistical Machine Translation
2003
"... Often, the training procedure for statistical machine translation models is based on maximum likelihood or related criteria. A general problem of this approach is that there is only a loose relation to the final translation quality on unseen text. In this paper, we analyze various training cri ..."
Cited by 682 (7 self)
Symmetry and Related Properties via the Maximum Principle
1979
"... We prove symmetry, and some related properties, of positive solutions of second order elliptic equations. Our methods employ various forms of the maximum principle, and a device of moving parallel planes to a critical position, and then showing that the solution is symmetric about the limiting plan ..."
Cited by 525 (4 self)
A Maximum-Entropy-Inspired Parser
1999
"... We present a new parser for parsing down to Penn treebank style parse trees that achieves 90.1% average precision/recall for sentences of length 40 and less, and 89.5% for sentences of length 100 and less when trained and tested on the previously established [5,9,10,15,17] "standard" se ..."
Cited by 964 (19 self)
sections of the Wall Street Journal tree bank. This represents a 13% decrease in error rate over the best single-parser results on this corpus [9]. The major technical innovation is the use of a "maximum-entropy-inspired" model for conditioning and smoothing that let us successfully to test
Maximum entropy Markov models for information extraction and segmentation
2000
"... Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many text-related tasks, such as part-of-speech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial ..."
Cited by 554 (17 self)
Error and attack tolerance of complex networks
2000
"... Many complex systems display a surprising degree of tolerance against errors. For example, relatively simple organisms grow, persist and reproduce despite drastic pharmaceutical or environmental interventions, an error tolerance attributed to the robustness of the underlying metabolic network [1]. C ..."
Cited by 981 (6 self)
Boosting the margin: A new explanation for the effectiveness of voting methods
In Proceedings International Conference on Machine Learning, 1997
"... One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and often is observed to decrease even after the training error reaches zero. In this paper, we show that this ..."
Cited by 885 (52 self)
that this phenomenon is related to the distribution of margins of the training examples with respect to the generated voting classification rule, where the margin of an example is simply the difference between the number of correct votes and the maximum number of votes received by any incorrect label. We show
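The margin definition quoted in this abstract can be illustrated with a minimal sketch. It assumes unweighted votes, one per base classifier; the function name `voting_margin` and the sample labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def voting_margin(votes, true_label):
    """Votes for the true label minus the most votes for any one incorrect label."""
    counts = Counter(votes)
    correct = counts.get(true_label, 0)
    # Largest vote count among incorrect labels; 0 if no incorrect label got a vote.
    most_wrong = max(
        (n for label, n in counts.items() if label != true_label),
        default=0,
    )
    return correct - most_wrong


# Five hypothetical base classifiers vote on one example whose true label is "cat".
print(voting_margin(["cat", "cat", "dog", "cat", "bird"], "cat"))  # 3 - 1 = 2
```

A positive margin means the vote classifies the example correctly; a negative margin means some incorrect label received more votes than the true one.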
Quantal Response Equilibria For Normal Form Games
Games and Economic Behavior, 1995
"... We investigate the use of standard statistical models for quantal choice in a game theoretic setting. Players choose strategies based on relative expected utility, and assume other players do so as well. We define a Quantal Response Equilibrium (QRE) as a fixed point of this process, and establish e ..."
Cited by 632 (28 self)
existence. For a logit specification of the error structure, we show that as the error goes to zero, QRE approaches a subset of Nash equilibria and also implies a unique selection from the set of Nash equilibria in generic games. We fit the model to a variety of experimental data sets by using maximum
Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification
Psychological Methods, 1998
"... This study evaluated the sensitivity of maximum likelihood (ML), generalized least squares (GLS), and asymptotic distribution-free (ADF)-based fit indices to model misspecification, under conditions that varied sample size and distribution. The effect of violating assumptions of asymptotic robustn ..."
Cited by 516 (0 self)
Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods
Advances in Large Margin Classifiers, 1999
"... The output of a classifier should be a calibrated posterior probability to enable post-processing. Standard SVMs do not provide such probabilities. One method to create probabilities is to directly train a kernel classifier with a logit link function and a regularized maximum likelihood score. Howev ..."
Cited by 1027 (0 self)
However, training with a maximum likelihood score will produce non-sparse kernel machines. Instead, we train an SVM, then train the parameters of an additional sigmoid function to map the SVM outputs into probabilities. This chapter compares classification error rate and likelihood scores for an SVM plus