A Maximum Entropy Approach to Natural Language Processing (1996) [628 citations — 5 self]

by Adam L. Berger, Vincent J. Della Pietra, Stephen A. Della Pietra
Computational Linguistics

Abstract:

The concept of maximum entropy can be traced back along multiple threads to Biblical times. Only recently, however, have computers become powerful enough to permit the widescale application of this concept to real world problems in statistical estimation and pattern recognition. In this paper we describe a method for statistical modeling based on maximum entropy. We present a maximum-likelihood approach for automatically constructing maximum entropy models and describe how to implement this approach efficiently, using as examples several problems in natural language processing.

Citations

4923 Elements of Information Theory – Cover, Thomas - 1991
4735 Maximum Likelihood from incomplete data via the EM algorithm – Dempster, Laird, et al. - 1977
549 The mathematics of statistical machine translation: Parameter estimation – Brown, Pietra, et al. - 1993
406 A statistical approach to machine translation – Brown, Cocke, et al. - 1990
396 Class-based n-gram models of natural language – Brown, et al. - 1990
362 Inducing features of random fields – Pietra, Pietra, et al. - 1997
295 Generalized iterative scaling for log-linear models – Darroch, Ratcliff - 1972
230 Interpolated estimation of Markov source parameters from sparse data – Jelinek, Mercer - 1980
153 I-divergence geometry of probability distributions and minimization problems – Csiszár - 1975
136 Towards history-based grammars: Using richer models for probabilistic parsing – Black, Jelinek, et al. - 1993
96 Information geometry and alternating minimization procedures – Csiszár, Tusnády - 1984
93 A tree-based statistical language model for natural speech recognition – Bahl, Brown, et al. - 1990
48 An information theoretic approach to the automatic determination of phonemic baseforms – Lucassen, Mercer - 1984
47 Tagging text with a probabilistic model – Merialdo - 1994
39 The Candide system for machine translation – Berger, Brown, et al. - 1994
28 Inference and estimation of a long-range trigram model – Pietra, Pietra, et al. - 1994
22 A Note on Approximations to Discrete Probability Distributions – Brown - 1959
18 The Principle of Maximum Entropy – Guiasu, Shenitzer - 1985
12 Inducing features of random fields – Pietra, Pietra, et al. - 1997
12 Notes on present status and future prospects – Jaynes - 1990
8 A statistical approach to machine translation – Brown, Pietra, et al. - 1990
2 A Maximum Entropy Approach to NLP – Berger, Della Pietra - 1991
1 A geometric interpretation of Darroch and Ratcliff's generalized iterative scaling – ibid - 1989
1 Continuous speech recognition with automatically selected acoustic prototypes obtained by either bootstrapping or clustering – Nádas, Bahl, et al. - 1981
1 A Geometric Interpretation of Darroch and Ratcliff's Generalized Iterative Scaling. The Annals of Statistics – ibid - 1989
1 A Statistical Approach to Sense Disambiguation – Brown, Pietra, et al. - 1991
1 I-Divergence Geometry of Probability Distributions and Minimization Problems, The Annals of Probability – Csiszár - 1975
1 Information Geometry and Alternating Minimization Procedures – Csiszár, Tusnády - 1984