### TABLE I Asymptotic and finite convergence properties of the

1994

Cited by 11

### Table 1 The duality of maximum entropy and maximum likelihood is an example of the more general

1996

"... In PAGE 10: ... This result provides an added justi cation for the maximum entropy principle: if the notion of selecting a model p ? on the basis of maximum entropy isn apos;t compelling enough, it so happens that this same p ? is also the model which, from among all models of the same parametric form (10), can best account for the training sample. Table1 summarizes the primal-dual framework wehave established. 3.... ..."

Cited by 614

### Table 5 Maximum entropy model to predict French translation of in. Features shown here were the

1996

"... In PAGE 22: ... The automatic feature selection algorithm rst selected a template 1 constraint for each of the translations of in seen in the sample (12 in all), thus constraining the model apos;s expected probabilityofeach of these translations to their empirical probabilities. The next few constraints selected by the algorithm are shown in Table5 . The rst column gives the identity of the feature whose expected value is constrained;; the second column gives L(S;;f), the approximate increase in the model apos;s log-likelihood on the data as a result of imposing this constraint;; the third column gives L(p), the log-likelihood after adjoining the feature and recomputing the model.... ..."

Cited by 614

### Table 1: Results on 5-class toy dataset. err: Error on test set. lh: Avg. likelihood test set. minent: Smallest entropy pred. distribution. maxent: Largest entropy pred. distribution.

2004

"... In PAGE 24: ... We reserve the issue of automatic model selection by empirical Bayesian methods for future work, which could certainly handle the case of independently parameterized K(c) covariance functions. Table1 gives test set results for the nal predictor (jIj = dfinal). The gures are the test set error err and the predictive probability of the true label lh, we also provide the maximum and minimum entropy of the predictive distributions over the test set.... ..."

Cited by 5

### Table 5: Model specification summary for estimation of GAFT duration specifications: optimized log-likelihood, Hannan-Quinn (HQ), Akaike (AIC), and Schwarz (BIC) model specification criteria.

"... In PAGE 21: ... The mean of the distribution is restricted to be 0 and the estimated variance is approximately 5.16 The left panel of Table5 presents optimized log likelihood values for the various speci cations, and information criteria to aid in selection of the number of terms in the discrete mixture. In- terestingly, the model selection criteria for the NPMLE in Table 5 show the standard pattern; the Schwarz (BIC) and Hannan-Quinn (HQ) criteria both select the two-point speci cation, while the Akaike (AIC) opts for a larger three-point mixture model.... In PAGE 21: ...16 The left panel of Table 5 presents optimized log likelihood values for the various speci cations, and information criteria to aid in selection of the number of terms in the discrete mixture. In- terestingly, the model selection criteria for the NPMLE in Table5 show the standard pattern; the Schwarz (BIC) and Hannan-Quinn (HQ) criteria both select the two-point speci cation, while the Akaike (AIC) opts for a larger three-point mixture model. For reasons that will become clear shortly, I will work primarily with the three term NPMLE and SNP speci cations in the remainder of the discussion In contrast to the straightforward interpretation of the probabilities and support in the NPMLE estimates of F , the coe cients of the SNP presented in Table 6 are more di cult to interpret.... In PAGE 22: ... It is also worth noting that the variances of the estimated mixing distribution roughly coincide for the three point of NPMLE and the three term SNP. The casual observation that the M = 3 expansion appears to be the best SNP speci cation is supported by both model selection criteria ( Table5 ) and visual inspection of the empirical Bayes estimates of the mixing distribution.17 As might be expected, the AIC selects a large model containing three terms in the expansion (as does the Hannan-Quinn criterion).... ..."

### Table 6: Finite Sample Distribution of T

"... In PAGE 13: ... In Table 5 we present the nite sample means and sample variances of T n under H 0 for all three samples. We report the 95% critical value in Table6 and the power of the test in Table 7. We also perform a Monte Carlo study to obtain the real sizes of the test in nite samples and compare them with the nominal sizes.... In PAGE 13: ... A detailed examination of Table 3 and Table 5 reveals that the asymptotic distri- bution of T n is very close to the nite sample distribution of T n across all three samples and all nite-variance distributions. Not surprisingly, therefore, we end up the same conclusions from Table 4 and Table6 . Table 4 indicates that, for all three samples,... In PAGE 14: ... Finally, Table 7 provides the evidence that in nite samples our test has very good power. From Table6 andFigure 1, we note that in terms of the size of the test, it works quite well for the normal distribution, the Student t distribution, the mixture of normal distribution, the compound log-normal and normal model, and the Weibull distribution. Although the size distortions are larger for the mixed di usion jump model, the biases suggest under-rejection of the model and hence support our nding of rejection of all nite-variance distributions in the above empirical study.... ..."

### Table 1: Parameter Estimates for the Coordinating and Non-Coordinating Models

"... In PAGE 28: ... For 1994 and 1998 there is a signi#0Ccant tendency for electors who have higher values of #12 i to be more likely to vote than electors who havelower values of #12 i : conservative electors were especially mobilized in those two elections. *** Table1 about here *** In every year, the coordinating model passes the parameter-based tests of the conditions neces- sary for it to describe coordinating behavior. Table 2 reports the LR test statistics for the constraint #0B = 1, imposed separately for eachyear.... In PAGE 29: ... The House position was expected to be closer to the Democratic position in 1978, 1982, 1986 and 1990, closer to the Republican position in 1994 and 1998. The MLEs for #0B in the coordinating model are less than :5inevery year except one #28see Table1 #29, suggesting that electors expected the Presidenttobeweaker than the House in determining post-midterm policy. *** Table 4 about here *** The distribution of the ordering of electors apos; ideal points with respect to the post-election policies electors expect according to the coordinating model shows that the moderating mechanism of the coordinating model is capable of generating a midterm cycle of the kind emphasized by Alesina and Rosenthal #281989; 1995#29, though it need not do so.... In PAGE 37: ... NES survey respondents mayoverreport the frequency with which they vote. Among the 9,639 cases from years 1978#7B98 that we use to compute the parameter estimates reported in Table1 , the ! i -weighted percentage reporting having voted is, by year: 47.... In PAGE 38: ... 19. Table1 shows #0B 90 , #0B 94 , #1A 78 , #1A 86 , #1A 90 and #1A 98 to have MLEs equal to either 0:0or1:0, on the conceptual boundary of the parameter space. Consequently, the asymptotic distributions of the MLEs and the LR test statistics are complicated #28Moran 1971; Self and Liang 1987#29.... In PAGE 39: ...Table1 to tabulate that mixture distribution and estimate the con#0Cdence intervals of Table 3. 
20.... In PAGE 48: ...524 .455 Note: Computed using the parameter MLEs in Table1 and 1978#7B98 ANES data. Table 5: Orderings of Ideal Points and Expected PartyPolicy Positions, byYear Ordering year #12 i #3C ~ #12 Mi ; ~ #12 i ~ #12 Mi #3C#12 i #3C ~ #12 i ~ #12 i #3C#12 i #3C ~ #12 Mi ~ #12 Mi ; ~ #12 i #3C#12 i #12 Di = #12 Ri amp; i =0 1978 19.... In PAGE 48: ... Entries show the percentage of electors in eachyear who have #12 Di #3C#12 Ri and the indicated ordering of ideal point and expected policy positions, or who have #12 Di = #12 Ri , or who lack policy position values #28 amp; i = 0#29. Computed using the parameter MLEs in Table1 and 1978#7B98 ANES data. Percentages for those with #12 Di #3E#12 Ri are, byyear: #12 i #3C ~ #12 i #285.... ..."

### Table 5 Maximum entropy model to predict French translation of in. Features shown here were the first non template 1 features selected. [verb marker] denotes a morphological marker inserted to indicate the presence of a verb as the next word.

1996

"... In PAGE 22: ... The automatic feature selection algorithm rst selected a template 1 constraint for each of the translations of in seen in the sample (12 in all), thus constraining the model apos;s expected probability of each of these translations to their empirical probabilities. The next few constraints selected by the algorithm are shown in Table5 . The rst column gives the identity of the feature whose expected value is constrained; the second column gives L(S; f), the approximate increase in the model apos;s log-likelihood on the data as a result of imposing this constraint; the third column gives L(p), the log-likelihood after adjoining the feature and recomputing the model.... ..."

Cited by 614

### Table 1: Selected previous strategies for state aggregation

2006

"... In PAGE 3: ... However, if such abstract MDPs are used to learn a policy for the larger MDP, they may not yield optimal policies [14, 18] and may even prevent some algorithms from converging [12]. A summary of the properties of the aforemen- tioned work is presented in Table1 . The table or- ders the algorithms roughly from strictest to coars- est abstractions.... ..."

Cited by 8

### Table 9. Negative log likelihood fits of various model components for BSAI POP models with a double logistic fishery selectivity curve (Model 1), a double logistic fishery selectivity curve that penalizes non-smoothness (Model 2), and an asymptotic fishery selectivity curve (Model 3).

"... In PAGE 11: ... In Model 3, the asymptotic logistic curve was used for selectivity. The likelihood components for the three models are shown in Table9 , and the estimated fishery and survey selectivities are shown in Figure 1. The double logistic curve produces the lowest negative log-likelihood, but marked by a sharply declining curve at older ages.... ..."