Maximum Entropy Markov Models for Information Extraction and Segmentation (2000) [263 citations — 15 self]

Abstract:

Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many text-related tasks, such as part-of-speech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial distributions over a discrete vocabulary, and the HMM parameters are set to maximize the likelihood of the observations. This paper presents a new Markovian sequence model, closely related to HMMs, that allows observations to be represented as arbitrary overlapping features (such as word, capitalization, formatting, part-of-speech), and defines the conditional probability of state sequences given observation sequences. It does this by using the maximum entropy framework to fit a set of exponential models that represent the probability of a state given an observation and the previous state. We present positive experimental results on the segmentation of FAQs.
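
The exponential model mentioned in the abstract, the probability of a state given an observation and the previous state, can be illustrated with a minimal sketch. It assumes binary features and one weight table per previous state; the function name, feature names, and weight values below are hypothetical and chosen only for illustration, and the weights are set by hand rather than fitted by a maximum entropy training procedure.

```python
import math

def memm_transition_probs(weights, active_features, states):
    """Sketch of P(s | s_prev, o) for one fixed previous state.

    weights: dict mapping (feature_name, state) -> weight (lambda)
    active_features: names of the binary features that fire on observation o
    states: candidate next states
    """
    # Score each candidate state by summing the weights of the active features.
    scores = {
        s: sum(weights.get((f, s), 0.0) for f in active_features)
        for s in states
    }
    # Subtract the maximum score before exponentiating, for numerical stability.
    top = max(scores.values())
    exp_scores = {s: math.exp(v - top) for s, v in scores.items()}
    # Normalize over the candidate states so the probabilities sum to one.
    z = sum(exp_scores.values())
    return {s: e / z for s, e in exp_scores.items()}

# Hypothetical hand-set weights for transitions out of the state "question".
weights_from_question = {
    ("contains-question-mark", "question"): 2.0,
    ("indented", "answer"): 1.2,
}

probs = memm_transition_probs(
    weights_from_question,
    active_features=["indented"],
    states=["question", "answer"],
)
```

Because the features are attached to the observation rather than generated by the state, several overlapping features (for example capitalization and indentation) can fire on the same line without any independence assumption between them.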

Citations

2397 A tutorial on hidden Markov models and selected applications in speech recognition – Rabiner - 1989
367 Inducing features of random fields – Della Pietra, Della Pietra, et al. - 1997
296 Generalized iterative scaling for log-linear models – Darroch, Ratcliff - 1972
272 The infinite Hidden Markov Model – Beal, Ghahramani, et al. - 2003
220 An algorithm that learns what’s in a name – Bikel, Schwartz, et al.
150 Robust part-of-speech tagging using a hidden Markov model (Computer Speech and Language) – Kupiec - 1992
142 Adaptive Statistical Language Modeling: A Maximum Entropy Approach – Rosenfeld - 1994
138 A Gaussian prior for smoothing maximum entropy models – Chen, Rosenfeld - 1999
137 Statistical models for text segmentation – Beeferman, Berger, et al. - 1999
129 Maximum Entropy Models for Natural Language Ambiguity Resolution – Ratnaparkhi - 1998
126 Stochastic simulation algorithms for dynamic probabilistic networks – Kanazawa, Koller, et al. - 1995
99 Introduction to probabilistic automata – Paz - 1971
74 Information extraction using hidden Markov models – Leek - 1997
73 Exploiting diverse knowledge sources via maximum entropy in named entity recognition – Borthwick, Sterling, et al. - 1998
37 Question answering from Frequently-Asked Question Files – Burke, Hammond, et al. - 1997
24 Information extraction using HMMs and shrinkage – Freitag, McCallum - 1999
18 Efficient sampling and feature selection in whole sentence maximum entropy language models – Chen, Rosenfeld - 1999
6 Markov processes on curves for automatic speech recognition – Saul, Rahim - 1999