Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data (2001) [848 citations — 45 self]

Abstract:

We present Conditional Random Fields, a framework for building probabilistic models to segment and label sequence data. Conditional random fields offer several advantages over hidden Markov models and stochastic grammars for such tasks, including the ability to relax strong independence assumptions made in those models. Conditional random fields also avoid a fundamental limitation of maximum entropy Markov models (MEMMs) and other discriminative Markov models based on directed graphical models, which can be biased towards states with few successor states. We present iterative parameter estimation algorithms for conditional random fields and compare the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
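The bias "towards states with few successor states" mentioned in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' implementation and does not cover their parameter estimation algorithms. It only contrasts per-state normalization (as in an MEMM) with normalization over whole label sequences (as in a linear-chain conditional random field). All potentials and function names (`INIT`, `TRANS`, `crf_prob`, `memm_prob`) are made up for illustration. The assumed setup has two labels, each with a single allowed successor, and second-position evidence that favors label 1.

```python
from itertools import product

# Hypothetical non-negative potentials for a 2-position sequence.
# INIT[y0]: score of the first label (the first observation is uninformative).
# TRANS[t][prev][cur]: score of moving prev -> cur given the next observation.
# Each label has exactly one allowed successor (the zeros forbid the rest).
INIT = [1.0, 1.0]
TRANS = [[[1.0, 0.0],   # from label 0: weak evidence, only 0 -> 0 allowed
          [0.0, 9.0]]]  # from label 1: strong evidence, only 1 -> 1 allowed


def path_score(init, trans, path):
    """Unnormalized score: product of potentials along one label path."""
    score = init[path[0]]
    for t in range(1, len(path)):
        score *= trans[t - 1][path[t - 1]][path[t]]
    return score


def crf_partition(init, trans):
    """Sum of path scores over all label paths, via the forward recursion."""
    alpha = list(init)
    for psi in trans:
        alpha = [sum(alpha[i] * psi[i][j] for i in range(len(alpha)))
                 for j in range(len(psi[0]))]
    return sum(alpha)


def crf_prob(init, trans, path):
    """Globally normalized probability of a whole path (CRF-style)."""
    return path_score(init, trans, path) / crf_partition(init, trans)


def memm_prob(init, trans, path):
    """Locally normalized probability: each step is normalized per state."""
    prob = init[path[0]] / sum(init)
    for t in range(1, len(path)):
        row = trans[t - 1][path[t - 1]]
        prob *= row[path[t]] / sum(row)
    return prob


if __name__ == "__main__":
    for path in product(range(2), repeat=2):
        if path_score(INIT, TRANS, path) > 0:
            print(path, crf_prob(INIT, TRANS, path), memm_prob(INIT, TRANS, path))
```

With these made-up numbers, the globally normalized model gives the path (1, 1) a probability of about 0.9, while the locally normalized model gives both allowed paths 0.5. Because each label has only one successor, per-state normalization turns every transition into probability 1 and the second observation has no effect.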

Citations

1205 A decision-theoretic generalization of on-line learning and an application to boosting – Freund, Schapire - 1997
628 A Maximum Entropy Approach to Natural Language Processing – Berger, Pietra, et al. - 1996
566 Transformation-based error-driven learning and natural language processing: A case study in part-of-speech tagging – Brill - 1995
441 Biological sequence analysis: Probabilistic models of proteins and nucleic acids. Cambridge – Durbin, Eddy, et al. - 1998
362 Inducing features of random fields – Pietra, Pietra, et al. - 1997
344 Foundations of statistical natural language processing – Manning, Schutze - 1999
295 Generalized iterative scaling for log-linear models – Darroch, Ratcliff - 1972
273 Gradient-based learning applied to document recognition – LeCun, Bottou, et al. - 1998
259 Maximum entropy Markov models for information extraction and segmentation – McCallum, Freitag, et al. - 2000
241 A maximum entropy model for part-of-speech tagging – Ratnaparkhi - 1996
206 Finite-state transducers in language and speech processing – Mohri - 1997
145 Discriminative re-ranking for natural language parsing – Collins
130 Learning to resolve natural language ambiguities: a unified approach – Roth - 1998
99 Introduction to probabilistic automata – Paz - 1971
69 Information extraction with HMM structures learned by stochastic optimization – Freitag - 2000
51 Markov fields on finite graphs and lattices. Unpublished – Hammersley, Clifford - 1971
50 Boosting applied to tagging and PP attachment – Abney, Schapire, et al. - 1999
47 Boltzmann chains and hidden Markov models – Saul, Jordan - 1995
45 Une Approche théorique de l’Apprentissage Connexionniste: Applications à la Reconnaissance de la Parole – Bottou - 1991
42 Minimization algorithms for sequential transducers – Mohri - 2000
23 A whole sentence maximum entropy language model – Rosenfeld - 1997
3 Equivalence of linear Boltzmann chains and hidden Markov models – MacKay - 1996
2 The use of classifiers in sequential inference. NIPS 13. Forthcoming – Punyakanok - 2001