Results 1 -
5 of
5
The Hierarchical Hidden Markov Model: Analysis and Applications
- MACHINE LEARNING
, 1998
"... . We introduce, analyze and demonstrate a recursive hierarchical generalization of the widely used hidden Markov models, which we name Hierarchical Hidden Markov Models (HHMM). Our model is motivated by the complex multi-scale structure which appears in many natural sequences, particularly in langua ..."
Abstract
-
Cited by 176 (3 self)
- Add to MetaCart
. We introduce, analyze and demonstrate a recursive hierarchical generalization of the widely used hidden Markov models, which we name Hierarchical Hidden Markov Models (HHMM). Our model is motivated by the complex multi-scale structure which appears in many natural sequences, particularly in language, handwriting and speech. We seek a systematic unsupervised approach to the modeling of such structures. By extendingthe standard forward-backward(BaumWelch) algorithm, we derive an efficient procedure for estimating the model parameters from unlabeled data. We then use the trained model for automatic hierarchical parsing of observation sequences. We describe two applications of our model and its parameter estimation procedure. In the first application we show how to construct hierarchical models of natural English text. In these models different levels of the hierarchy correspond to structures on different length scales in the text. In the second application we demonstrate how HHMMs can b...
Distributional Part-of-Speech Tagging
- In Proc. of 7th Conference of the European Chapter of the Association for Computational Linguistics
, 1995
"... This paper presents an algorithm for tagging words whose part-of-speech properties are unknown. Unlike previous work, the algorithm categorizes word tokens in context instead of word types. The algorithm is evaluated on the Brown Corpus. ..."
Abstract
-
Cited by 75 (6 self)
- Add to MetaCart
This paper presents an algorithm for tagging words whose part-of-speech properties are unknown. Unlike previous work, the algorithm categorizes word tokens in context instead of word types. The algorithm is evaluated on the Brown Corpus.
Part-of-Speech Tagging Using a Variable Memory Markov Model
- In ACL 32'nd
, 1994
"... We present a new approach to disambiguating syntactically ambiguous words in context, based on Variable Memory Markov (VMM) models. In contrast to fixed-length Markov models, which predict based on fixed-length histories, variable memory Markov models dynamically adapt their history length based on ..."
Abstract
-
Cited by 20 (2 self)
- Add to MetaCart
We present a new approach to disambiguating syntactically ambiguous words in context, based on Variable Memory Markov (VMM) models. In contrast to fixed-length Markov models, which predict based on fixed-length histories, variable memory Markov models dynamically adapt their history length based on the training data, and hence may use fewer parameters. In a test of a VMM based tagger on the Brown corpus, 95.81% of tokens are correctly classified.
The Hidden Vector State Language Model
"... The Hidden Vector State (HVS) model extends the basic Hidden Markov Model (HMM) by encoding each state as a vector of stack states but with restricted stack operations. The model uses a right branching stack automaton to assign valid stochastic parses to a word sequence from which the language model ..."
Abstract
- Add to MetaCart
The Hidden Vector State (HVS) model extends the basic Hidden Markov Model (HMM) by encoding each state as a vector of stack states but with restricted stack operations. The model uses a right branching stack automaton to assign valid stochastic parses to a word sequence from which the language model probability can be estimated. The model is completely data driven and is able to model classes from the data that reflect the hierarchical structures found in natural language. This paper describes the design and the implementation of the HVS language model [1], focusing on the practical issues of initialisation and training using Baum-Welch re-estimation whilst accommodating a large and dynamic state space. Results of experiments conducted using the ATIS corpus [2] show that the HVS language model reduces test set perplexity compared to standard class based language models. 1.
A Spanish POS tagger with variable memory
"... An implementation of a Spanish POS tagger is described in this paper. This implementation combines three basic approaches: a single word tagger based on decision trees, a POS tagger based on variable memory Markov models, and a feature structures set of tags. Using decision trees for single word t ..."
Abstract
- Add to MetaCart
An implementation of a Spanish POS tagger is described in this paper. This implementation combines three basic approaches: a single word tagger based on decision trees, a POS tagger based on variable memory Markov models, and a feature structures set of tags. Using decision trees for single word tagging allows the tagger to work without a lexicon that lists only possible tags. Moreover, it decreases the error rate because there are no unknown words. The feature structure set of tags is advantageous when the available training corpus is small and the tag set large, which can be the case with morphologically rich languages like Spanish. Finally, variable memory Markov models training is more efficient than traditional full-order Markov models and achieves better accuracy. In this implementation, 98.58Y0 of tokens are correctly classified.

