## Expected-Frequency Interpolation (1996)

Citations: 9 (1 self)

### BibTeX

@TECHREPORT{Charniak96expected-frequencyinterpolation,

author = {Eugene Charniak},

title = {Expected-Frequency Interpolation},

institution = {},

year = {1996}

}

### Abstract

Expected-frequency interpolation is a technique for improving the performance of deleted interpolation smoothing. It allows a system to make finer-grained estimates of how often one would expect to see a particular combination of events than is possible with traditional frequency interpolation. This allows the system to better weigh the emphasis given to the various probability distributions being mixed. We show that more traditional frequency interpolation, based solely on the frequency of conditioning events, can lead to some anomalous results. We then show that while the equations for expected-frequency interpolation are not exact, they are close, depending on how well some seemingly reasonable assumptions hold. We then present an experiment in which the introduction of expected-frequency interpolation to a statistical parsing system improved performance by 0.4% with essentially no extra work, and essentially no change in the workings of the system. We also note that even before the c...

### Citations

2179 | Building a large annotated corpus of English: The Penn treebank
- Marcus, Santorini, et al.
- 1993
Citation Context ...re are also a lot of very low probability items like "31.65," things we typically do not find modifying "January." 10 slightly under one million words of the Penn tree-bank Wall Street Journal corpus [5]. It was tested on another 50,000 words of held out testing data. These are all of the sentences in one sub-file of the corpus restricting consideration to those of length less than or equal to 40 wor...

444 | A new statistical parser based on bigram lexical dependencies
- Collins
- 1996
Citation Context ...tatistically significant effect on the measures we do care about. In the following table we give the results of our two runs, along with the best previous results on this exact test, those of Collins [3].

| System | Labeled Precision | Labeled Recall |
|---|---|---|
| Frequency Interpolation | 86.7% | 86.4% |
| Expected-Frequency Interpolation | 87.1% | 86.8% |
| Collins | 86.3% | 85.8% |

3 Actually, to allow comparison to previous work, we use ...

371 | Statistical Parsing with a Context-Free Grammar and
- Charniak
- 1997
Citation Context ...nterpolation can lead to anomalous results in many cases. Section three outlines our solution to this problem, while section four gives some empirical results. 2 Frequency Interpolation in Parsing In [2] we describe a program that parses using a language model. The program assigns every sentence a probability that is the sum of the probabilities of all of the sentence's possible parses (or at least a...

296 | Structural ambiguity and lexical relations
- Hindle, Rooth
- 1993
Citation Context ...whereas in the parse in which the pp is attached to "started" the relevant probability is $p(\text{about} \mid \text{started})$. The reader might note that these probabilities are the same ones used by Hindle and Rooth [4] in their early work on pp attachment. More generally we say that the probability of the (sub)head s of a phrase of type t (e.g., pp) according to a parse in which the phrase is attached to a parent h...

253 | Tagging English text with a probabilistic model
- Merialdo
- 1994
Citation Context ...ll smoothing equations of this sort frequency interpolation. A good example of frequency interpolation can be found in [1]. An example of deleted interpolation used for tagging models can be found in [6], which also discusses the use of frequency interpolation. While deleted interpolation, and more specifically, frequency interpolation works reasonably well for trigram language modeling, we suspect t...

67 | An estimate of an upper bound for the entropy of English
- Brown, Pietra, et al.
- 1992
Citation Context ...ly interpolating several probabilities, each one deleting some of the conditioning events of the probability to be estimated. The canonical example of this occurs in trigram models of language (e.g., [1]), where the basic probability is the probability of a word given the two previous words in the text, $p(w_i \mid w_{i-2}, w_{i-1})$, where $w_i$ is the ith word in the text. Since the number of wor...
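The trigram case described in the last citation context can be illustrated with a small sketch. It shows traditional frequency interpolation, where the mixing weights depend only on how often the conditioning events were seen; it is not the expected-frequency variant, whose equations do not appear in this excerpt. The names `train` and `prob`, the constant `k`, and the weight formula `c / (c + k)` are assumptions made for illustration. In practice the weights are usually estimated from held-out data rather than set by a fixed formula.

```python
from collections import Counter


def train(tokens):
    """Collect n-gram counts and the counts of their conditioning contexts."""
    return {
        "total": len(tokens),
        "uni": Counter(tokens),
        "bi": Counter(zip(tokens, tokens[1:])),
        "tri": Counter(zip(tokens, tokens[1:], tokens[2:])),
        # Contexts are counted only where a following word exists,
        # so each conditional estimate sums to one over the vocabulary.
        "ctx1": Counter(tokens[:-1]),
        "ctx2": Counter(zip(tokens[:-2], tokens[1:-1])),
    }


def prob(model, w2, w1, w, k=5.0):
    """Interpolated estimate of p(w | w2, w1).

    The weights grow with the frequency of the conditioning event.
    The form c / (c + k) is a hypothetical stand-in for trained weights.
    """
    c1 = model["ctx1"][w1]
    c2 = model["ctx2"][(w2, w1)]
    p_uni = model["uni"][w] / model["total"]
    p_bi = model["bi"][(w1, w)] / c1 if c1 else 0.0
    p_tri = model["tri"][(w2, w1, w)] / c2 if c2 else 0.0
    lam_tri = c2 / (c2 + k)  # weight on the trigram estimate
    lam_bi = c1 / (c1 + k)   # weight on the bigram estimate
    # Back off through progressively less specific distributions.
    return lam_tri * p_tri + (1 - lam_tri) * (lam_bi * p_bi + (1 - lam_bi) * p_uni)


tokens = "the cat sat on the mat the cat ate".split()
model = train(tokens)
print(prob(model, "the", "cat", "sat"))
```

Because an unseen context receives weight zero, the estimate falls back to the bigram or unigram distribution, and the mixture remains a proper distribution over the training vocabulary.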