Statistical Models for Text Segmentation
Machine Learning, 1999
This paper introduces a new statistical approach to automatically partitioning text into coherent segments. The approach is based on a technique that incrementally builds an exponential model to extract features that are correlated with the presence of boundaries in labeled training text. The models use two classes of features: topicality features that use adaptive language models in a novel way to detect broad changes of topic, and cue-word features that detect occurrences of specific words, which may be domain-specific, that tend to be used near segment boundaries. Assessment of our approach on quantitative and qualitative grounds demonstrates its effectiveness in two very different domains, Wall Street Journal news articles and television broadcast news story transcripts. Quantitative results on these domains are presented using a new probabilistically motivated error metric, which combines precision and recall in a natural and flexible way. This metric is used to make a quantitative ...
Bunsetsu Identification Using Category-Exclusive Rules
2000
This paper describes two new bunsetsu identification methods using supervised learning. Since Japanese syntactic analysis is usually done after bunsetsu identification, bunsetsu identification is important for analyzing Japanese sentences. In experiments comparing the four previously available machine-learning methods (decision tree, maximum-entropy method, example-based approach and decision list) and two new methods using category-exclusive rules, the new method using the category-exclusive rules with the highest similarity performed best.
Sequence Modeling with Mixtures of Conditional Maximum Entropy Distributions
Abstract. We present a novel approach to modeling sequences using mixtures of conditional maximum entropy distributions. Our method generalizes the mixture of first-order Markov models by including the "long-term" dependencies in model components. The "long-term" dependencies are represented by probabilistic triggers or rules, which are frequently used in the natural language processing (NLP) domain (such as "A occurred k positions back ⇒ the current symbol is B with probability P"). The maximum entropy framework is then used to create a coherent probabilistic model from all triggers selected for modeling. In order to represent hidden or unobserved effects in the data, we use probabilistic mixtures with maximum entropy models as components. We demonstrate how our mixture of conditional maximum entropy models can be learned from data using the EM algorithm that scales linearly in the dimensions of the data and the number of mixture components. We present empirical results on simulated and real-world data sets and demonstrate that the proposed approach enables us to create better quality models than the mixtures of first-order Markov models and to resist the overfitting and curse of dimensionality that would inevitably present themselves for higher-order Markov models.
Keywords: Mixture model, maximum entropy, latent structure, sequential data.
1 Introduction
Analyzing protein families or DNA sequences, understanding behavior of a Web user at a Web site, preventing intrusions on a UNIX system by studying the commands issued by the users, recommending books to the customers of the Internet bookstore--all these and many other tasks can be reduced to a sequence modeling problem. Consider the following probabilistic statement of a discrete sequence modeling problem. Given a set of discrete sequences

