An Empirical Study of Smoothing Techniques for Language Modeling (1998)

by Stanley F. Chen
Citations: 631 (19 self)

BibTeX

@TECHREPORT{Chen98anempirical,
    author = {Stanley F. Chen},
    title = {An Empirical Study of Smoothing Techniques for Language Modeling},
    institution = {},
    year = {1998}
}

Abstract

We present an extensive empirical comparison of several smoothing techniques in the domain of language modeling, including those described by Jelinek and Mercer (1980), Katz (1987), and Church and Gale (1991). We investigate for the first time how factors such as training data size, corpus (e.g., Brown versus Wall Street Journal), and n-gram order (bigram versus trigram) affect the relative performance of these methods, which we measure through the cross-entropy of test data. In addition, we introduce two novel smoothing techniques, one a variation of Jelinek-Mercer smoothing and one a very simple linear interpolation technique, both of which outperform existing methods.
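As a rough illustration of the evaluation the abstract describes, the sketch below scores a bigram model smoothed by simple linear interpolation using the cross-entropy of test data. It is not the paper's implementation: the function names, the toy corpus, the fixed weight `lam`, the add-one unigram fallback, and the `<unk>` vocabulary entry are all assumptions made for this example. Jelinek-Mercer smoothing proper estimates its interpolation weights from held-out data rather than fixing them by hand.

```python
import math
from collections import Counter


def train(tokens):
    """Collect unigram, bigram, and history counts from a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    histories = Counter(tokens[:-1])  # how often each word occurs as a context
    return unigrams, bigrams, histories


def prob(word, prev, model, vocab_size, lam=0.7):
    """Interpolate the maximum-likelihood bigram with a smoothed unigram.

    lam is a hand-picked weight (an assumption for this sketch).
    """
    unigrams, bigrams, histories = model
    total = sum(unigrams.values())
    # Add-one unigram estimate so unseen words keep nonzero probability.
    p_uni = (unigrams[word] + 1) / (total + vocab_size)
    seen = histories[prev]
    if seen == 0:
        # Unseen context: fall back entirely to the unigram estimate.
        return p_uni
    p_ml = bigrams[(prev, word)] / seen
    return lam * p_ml + (1.0 - lam) * p_uni


def cross_entropy(test_tokens, model, vocab_size, lam=0.7):
    """Average negative log2 probability per predicted word (bits/word)."""
    pairs = list(zip(test_tokens, test_tokens[1:]))
    log_prob = sum(
        math.log2(prob(w, prev, model, vocab_size, lam)) for prev, w in pairs
    )
    return -log_prob / len(pairs)


# Toy data, purely illustrative.
train_tokens = "the cat sat on the mat the dog sat on the log".split()
test_tokens = "the dog sat on the mat".split()
vocab = sorted(set(train_tokens) | {"<unk>"})
model = train(train_tokens)

print(cross_entropy(test_tokens, model, len(vocab)))
```

Lower cross-entropy means the model assigns higher probability to the test data, which is the sense in which one smoothing method "outperforms" another in a comparison of this kind.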

Citations

649 A stochastic parts program and noun phrase parser for unrestricted text - Church - 1988
574 Estimation of probabilities from sparse data for the language model component of a speech recognizer - Katz - 1987
502 A statistical approach to machine translation - Brown, Cocke, et al. - 1990
352 Theory of probability - Jeffreys - 1961
345 A maximum likelihood approach to continuous speech recognition - Bahl, Jelinek - 1983
286 The population frequencies of species and the estimation of population parameters - Good - 1953
286 Interpolated estimation of Markov source parameters from sparse data - Jelinek, Mercer - 1980
136 Natural Language Parsing as Statistical Pattern Recognition - Magerman - 1994
122 Prepositional Phrase Attachment through a Backed-off Model - Collins, Brooks - 1995
119 A comparison of the enhanced Good-Turing and deleted estimation methods for estimating probabilities of English bigrams. Computer Speech and Language - Church, Gale - 1991
66 A hierarchical Dirichlet language model - MacKay, Peto - 1994
63 A Spelling Correction Program Based on a Noisy Channel Model - Kernighan, Church, et al. - 1990
60 An estimate of an upper bound for the entropy of English - Brown, Pietra, et al. - 1992
60 Building Probabilistic Models for Natural Language - Chen - 1996
50 Good-Turing Frequency Estimation Without Tears - Gale, Sampson - 1995
33 Estimation of probabilities in the language model of the IBM speech recognition system - Nadas - 1984
30 A statistical approach to machine translation - Lafferty, Roossin - 1990
30 Note on the general case of the Bayes-Laplace formula for inductive or a posteriori probabilities - Lidstone - 1920
25 Probability: deductive and inductive problems - Johnson - 1932
22 What’s wrong with adding one - Gale, Church - 1994
18 A hierarchical Dirichlet language model. Natural language engineering - MacKay, Peto - 1995
7 A maximum likelihood approach to continuous speech recognition - Mercer - 1983
7 Estimation procedures for language context: poor estimates are worse than none - Gale, Church - 1990
1 What’s wrong with adding one - Gale, Church - 1994