Two decades of statistical language modeling: Where do we go from here? (2000)

by Ronald Rosenfeld
Venue: Proceedings of the IEEE
Citations: 119 (1 self)
BibTeX

@ARTICLE{Rosenfeld00twodecades,
    author = {Ronald Rosenfeld},
    title = {Two decades of statistical language modeling: Where do we go from here?},
    journal = {Proceedings of the IEEE},
    year = {2000}
}

Abstract

Statistical Language Models estimate the distribution of various natural language phenomena for the purpose of speech recognition and other language technologies. Since the first significant model was proposed in 1980, many attempts have been made to improve the state of the art. We review them here, point to a few promising directions, and argue for a Bayesian approach to integration of linguistic theories with data.

1. OUTLINE

Statistical language modeling (SLM) is the attempt to capture regularities of natural language for the purpose of improving the performance of various natural language applications. By and large, statistical language modeling amounts to estimating the probability distribution of various linguistic units, such as words, sentences, and whole documents.

Statistical language modeling is crucial for a large variety of language technology applications. These include speech recognition (where SLM got its start), machine translation, document classification and routing, optical character recognition, information retrieval, handwriting recognition, spelling correction, and many more. In machine translation, for example, purely statistical approaches have been introduced in [1]. But even researchers using rule-based approaches have found it beneficial to introduce some elements of SLM and statistical estimation [2]. In information retrieval, a language modeling approach was recently proposed by [3], and a statistical/information theoretical approach was developed by [4].

SLM employs statistical estimation techniques using language training data, that is, text. Because of the categorical nature of language, and the large vocabularies people naturally use, statistical techniques must estimate a large number of parameters, and consequently depend critically on the availability of large amounts of training data.
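
The dependence on large amounts of training data can be illustrated with a rough sketch. The code below is not taken from the paper: it assumes an unsmoothed n-gram model, a hypothetical 20,000-word vocabulary, and a toy six-word corpus, and the function names are made up for illustration. It counts the free parameters such a model would have and computes maximum-likelihood bigram estimates from raw counts.

```python
from collections import Counter


def ngram_parameter_count(vocab_size: int, order: int) -> int:
    """Free parameters of an unsmoothed n-gram model (illustrative only).

    Each of the vocab_size ** (order - 1) possible histories has its own
    distribution over vocab_size words, i.e. vocab_size - 1 free values.
    """
    return vocab_size ** (order - 1) * (vocab_size - 1)


def bigram_mle(tokens):
    """Maximum-likelihood bigram estimates: count(prev, w) / count(prev)."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Only tokens that are followed by another token can act as a history.
    history_counts = Counter(tokens[:-1])
    return {
        (prev, word): count / history_counts[prev]
        for (prev, word), count in pair_counts.items()
    }


# Assumed vocabulary size; the paper does not give this figure here.
trigram_params = ngram_parameter_count(20_000, 3)

# Toy corpus, far too small to estimate anything reliably.
tokens = "the cat sat on the mat".split()
probs = bigram_mle(tokens)
p_cat_given_the = probs[("the", "cat")]
```

Under these assumptions the trigram model has on the order of 10^13 free parameters, while the toy corpus supplies only five bigram observations, so almost every parameter would be estimated from zero counts. That gap is the motivation for the smoothing and backoff work listed in the citations below.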

Citations

4404 The Mathematical Theory of Communication - Shannon, Weaver - 1963
3143 Classification and Regression Trees - Breiman, Friedman, et al. - 1984
2168 Indexing by latent semantic analysis - Deerwester, Dumais, et al. - 1990
1654 Building a large annotated corpus of English: The Penn Treebank - Marcus, Marcinkiewicz, et al. - 1993
846 A Maximum Entropy Approach to Natural Language Processing - Berger, Pietra, et al. - 1996
684 A language modeling approach to information retrieval - Ponte, Croft - 1998
631 An Empirical Study of Smoothing Techniques for Language Modeling - Chen, Goodman - 1996
612 Statistical Methods for Speech Recognition - Jelinek - 1997
574 Estimation of probabilities from sparse data for the language model component of a speech recognizer - Katz - 1987
540 Class-based n-gram models of natural language - Brown, Pietra, et al. - 1992
502 A statistical approach to machine translation - Brown, Cocke, et al. - 1990
464 Inducing Features of Random Fields - Pietra, Pietra, et al. - 1997
448 Information theory and statistical mechanics - Jaynes - 1957
411 SWITCHBOARD: Telephone speech corpus for research and development - Godfrey, Holliman, et al. - 1992
396 A New Statistical Parser Based on Bigram Lexical Dependencies - Collins - 1996
355 Generalized Iterative Scaling for Log-Linear Models - Darroch, Ratcliff - 1972
313 Parsing english with a link grammar - Sleator, Temperley - 1993
302 Self-organized language modeling for speech recognition - Jelinek - 1990
286 The population frequencies of species and the estimation of population parameters - Good - 1953
286 Interpolated estimation of Markov source parameters from sparse data - Jelinek, Mercer - 1980
279 Prediction and entropy of printed English - Shannon - 1951
264 Statistical language modeling using the CMU-Cambridge toolkit - Clarkson, Rosenfeld - 1997
232 Trainable Grammars for Speech Recognition - Baker - 1979
220 Information retrieval as statistical translation - Berger, Lafferty - 1999
201 A maximum entropy approach to adaptive statistical language modelling - Rosenfeld - 1996
195 The zero-frequency problem: estimating the probabilities of novel events in adaptive text compression - Witten, Bell - 1991
179 Improved backing-off for m-gram language modeling - Kneser, Ney - 1995
149 On structuring probabilistic dependencies in stochastic language modeling (Computer Speech and Language) - Ney, Essen, et al. - 1994
142 A cache-based natural language model for speech recognition - Kuhn, Mori - 1990
132 The Design for the Wall Street Journal-based CSR Corpus - Paul, Baker - 1992
106 A tree-based statistical language model for natural language speech recognition - Bahl, Brown, et al. - 1989
105 Basic methods of probabilistic context-free grammars - Jelinek, Lafferty, et al. - 1992
83 Evaluation of Spoken Language Systems: the ATIS Domain - Price - 1990
82 Improved clustering techniques for class-based statistical language modelling - Kneser, Ney - 1993
82 Two experiments on learning probabilistic dependency grammars from corpora - Carroll, Charniak - 1992
79 Grammatical trigrams: a probabilistic model of link grammar - Lafferty, Sleator, et al. - 1992
77 Modeling long distance dependence in language: Topic mixtures vs. dynamic cache models - Iyer, Ostendorf - 1996
75 A Survey of Smoothing Techniques for ME Models - Chen, Rosenfeld - 2000
69 The Power of Amnesia - Ron, Singer, Tishby
68 Trigger-Based Language Models: a Maximum Entropy Approach - Lau, Rosenfeld, et al. - 1993
62 The CMU statistical language modeling toolkit and its use in the 1994 ARPA CSR evaluation - Rosenfeld - 1995
48 A convergent gambling estimate of the entropy of English - Cover, King - 1978
48 Using story topics for language model adaptation - Seymore, Rosenfeld - 1997
44 Design of a linguistic postprocessor using variable memory length Markov models - Guyon, Pereira - 1995
43 The CMU air travel information service: Understanding spontaneous speech - Ward - 1990
41 A Model of Lexical Attraction and Repulsion - Beeferman, Berger, et al.
31 Adaptive language modeling using minimum discriminant estimation - Pietra, Pietra, et al. - 1992
29 Evaluation metrics for language models - Chen, Beeferman, et al. - 1998
28 On the dynamic adaptation of stochastic language models - Kneser, Steinbiss - 1993
27 A multispan language modeling framework for large vocabulary speech recognition - Bellegarda - 1998