A Bayesian Interpretation of Interpolated Kneser-Ney (2006)

by Yee Whye Teh
Citations:8 - 2 self
BibTeX

@TECHREPORT{Teh06abayesian,
    author = {Yee Whye Teh},
    title = {A {B}ayesian Interpretation of Interpolated {K}neser-{N}ey},
    institution = {},
    year = {2006}
}

Abstract

Interpolated Kneser-Ney is one of the best smoothing methods for n-gram language models. Previous explanations for its superiority have been based on intuitive and empirical justifications of specific properties of the method. We propose a novel interpretation of interpolated Kneser-Ney as approximate inference in a hierarchical Bayesian model consisting of Pitman-Yor processes. As opposed to past explanations, our interpretation can recover exactly the formulation of interpolated Kneser-Ney, and performs better than interpolated Kneser-Ney when a better inference procedure is used.

Citations

887 Bayesian Data Analysis - Gelman, Carlin, et al. - 1995
846 A Maximum Entropy Approach to Natural Language Processing - Berger, Pietra, et al. - 1996
631 An Empirical Study of Smoothing Techniques for Language Modeling - Chen, Goodman - 1996
355 Maximum entropy Markov models for information extraction and segmentation - McCallum, Freitag, et al. - 2000
328 Hierarchical Dirichlet processes - Teh, Jordan, et al. - 2006
216 A constructive definition of Dirichlet priors - Sethuraman - 1994
179 Improved backing-off for m-gram language modeling - Kneser, Ney - 1995
162 The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator. Annals of Probability - Pitman, Yor - 1997
160 Gibbs sampling methods for stick-breaking priors - Ishwaran, James - 2001
154 Adaptive Statistical Language Modeling: A Maximum Entropy Approach - Rosenfeld - 1994
119 A comparison of the enhanced Good-Turing and deleted estimation methods for estimating probabilities of English bigrams. Computer Speech and Language - Church, Gale - 1991
119 Two decades of statistical language modeling: Where do we go from here - Rosenfeld
82 Combinatorial Stochastic Processes - Pitman - 2006
81 A neural probabilistic language model - Bengio, Ducharme, et al.
75 A Survey of Smoothing Techniques for ME Models - Chen, Rosenfeld - 2000
70 A bit of progress in language modeling - Goodman - 2001
61 Factored language models and generalized parallel backoff - Bilmes, Kirchhoff - 2003
59 Structure Learning in conditional probability models via an entropic prior and parameter extinction - Brand
56 Interpolating between types and tokens by estimating power-law generators - Goldwater, Griffiths, et al. - 2006
42 Exponential priors for maximum entropy models - Goodman - 2004
39 Offline recognition of unconstrained handwritten texts using HMMs and statistical language models - Vinciarelli, Bengio, et al. - 2004
18 A hierarchical Dirichlet language model. Natural language engineering - MacKay, Peto - 1995
15 A unified approach to generalized Stirling numbers - Hsu, Shiue - 1998
10 Distributed latent variable models of lexical co-occurrences - Blitzer, Globerson, et al. - 2005
10 Random forests in language modeling - Xu, Jelinek - 2004
9 Dirichlet processes, Chinese restaurant processes and all that. Tutorial presented at NIPS conference - Jordan - 2005
8 Immediate Head Parsing for Language Models - Charniak - 2001
2 Conditional random fields: Probabilistic models for segmenting and labeling sequence data - Lafferty, McCallum, et al. - 2001
1 Two decades of statistical language modeling: Where do we go from here - unknown authors - 2000