Probabilistic DFA Inference using Kullback-Leibler Divergence and Minimality
In Seventeenth International Conference on Machine Learning, 2000
Cited by 6 (2 self)
Probabilistic DFA inference is the problem of inducing a stochastic regular grammar from a positive sample of an unknown language. The ALERGIA algorithm is one of the most successful approaches to this problem. In the present work we review this algorithm and explain why its generalization criterion, a state merging operation, is purely local. This characteristic leads to the conclusion that there is no explicit way to bound the divergence between the distribution defined by the solution and the training set distribution (that is, to control globally the generalization from the training sample). In this paper we present an alternative approach, the MDI algorithm, in which the solution is a probabilistic automaton that trades off minimal divergence from the training sample and minimal size. An efficient computation of the Kullback-Leibler divergence between two probabilistic DFAs is described, from which the new learning criterion is derived. Empirical results in the d...
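As a rough illustration of the quantity this abstract refers to, the sketch below computes a truncated Kullback-Leibler divergence between two small probabilistic DFAs by enumerating every string up to a length bound. This brute-force sum is not the efficient computation the paper describes. The `PDFA` class, its dictionary layout, and the two toy automata are assumptions made for the example.

```python
import math
from itertools import product

class PDFA:
    """Hypothetical minimal probabilistic DFA (not the paper's data structure).

    trans[state][symbol] = (next_state, probability)
    stop[state] = probability of ending the string in that state
    For each state, the stop probability plus the outgoing probabilities sum to 1.
    """
    def __init__(self, start, trans, stop):
        self.start, self.trans, self.stop = start, trans, stop

    def prob(self, string):
        """Probability that the automaton generates exactly `string`."""
        state, p = self.start, 1.0
        for symbol in string:
            if symbol not in self.trans[state]:
                return 0.0
            state, q = self.trans[state][symbol]
            p *= q
        return p * self.stop[state]

def truncated_kl(a, b, alphabet, max_len):
    """Sum of P_a(x) * log(P_a(x) / P_b(x)) over all strings with len(x) <= max_len."""
    total = 0.0
    for n in range(max_len + 1):
        for string in product(alphabet, repeat=n):
            pa = a.prob(string)
            if pa == 0.0:
                continue  # strings impossible under `a` contribute nothing
            pb = b.prob(string)
            if pb == 0.0:
                return math.inf  # `a` generates a string that `b` cannot
            total += pa * math.log(pa / pb)
    return total

# Toy one-state automata over {"a"}: the string a^n has probability (1 - s)^n * s,
# where s is the stop probability.
A = PDFA(0, {0: {"a": (0, 0.5)}}, {0: 0.5})
B = PDFA(0, {0: {"a": (0, 0.75)}}, {0: 0.25})
kl_ab = truncated_kl(A, B, ["a"], 60)
```

For these two toy automata the untruncated divergence has the closed form log(4/3), so the truncated sum can be checked against it. The enumeration grows exponentially with the alphabet size and length bound, which is why an efficient computation matters in practice.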
A Markovian Approach to the Induction of Regular String Distributions
In Grammatical Inference: Algorithms and Applications, number 3264 in Lecture Notes in Artificial Intelligence, 2004
Cited by 2 (2 self)
We propose in this paper a novel approach to the induction of the structure of Hidden Markov Models (HMMs). The notion of partially observable Markov models (POMMs) is introduced. POMMs form a particular case of HMMs where any state emits a single letter with probability one, but several states can emit the same letter. It is shown that any HMM can be represented by an equivalent POMM. The proposed induction algorithm aims at finding a POMM fitting a sample drawn from an unknown target POMM. The induced model is built to fit the dynamics of the target machine observed in the sample. A POMM is seen as a lumped process of a Markov chain and the induced POMM is constructed to best approximate the stationary distribution and the mean first passage times (MFPT) observed in the sample. The induction relies on iterative state splitting from an initial maximum likelihood model. The transition probabilities of the updated model are found by solving an optimization problem to minimize the difference between the observed MFPT and their values computed in the induced model.
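The two quantities this abstract fits, the stationary distribution and the mean first passage times, can be illustrated for a known Markov chain. The sketch below is not the induction algorithm itself: it only computes both quantities from a given transition matrix by simple iteration, assuming the chain is irreducible and aperiodic. The function names and the two-state matrix `P` are hypothetical.

```python
def stationary_distribution(P, iterations=2000):
    """Power iteration: repeatedly apply pi <- pi P, starting from the uniform distribution."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def mean_first_passage_times(P, iterations=2000):
    """Fixed-point iteration on m[i][j] = 1 + sum over k != j of P[i][k] * m[k][j].

    m[i][j] is the expected number of steps to first reach state j from state i;
    the diagonal m[j][j] is the mean return time to j.
    """
    n = len(P)
    m = [[0.0] * n for _ in range(n)]
    for _ in range(iterations):
        m = [[1.0 + sum(P[i][k] * m[k][j] for k in range(n) if k != j)
              for j in range(n)]
             for i in range(n)]
    return m

# Hypothetical two-state chain used only for illustration.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
mfpt = mean_first_passage_times(P)
```

For this chain the stationary distribution is (5/6, 1/6), and the mean return time to each state is the reciprocal of its stationary probability, which gives a quick consistency check between the two functions. A fixed iteration count is used for brevity; solving the corresponding linear systems directly would be the more usual choice for larger chains.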
Learning Hidden Markov Models to Fit Long-Term Dependencies
2005
Cited by 2 (2 self)
this report a novel approach to the induction of the structure of Hidden Markov Models (HMMs). The notion of partially observable Markov models (POMMs) is introduced. POMMs form a particular case of HMMs where any state emits a single letter with probability one, but several states can emit the same letter. It is shown that any HMM can be represented by an equivalent POMM. The proposed induction algorithm aims at finding a POMM fitting the dynamics of the target machine, that is, to best approximate the stationary distribution and the mean first passage times observed in the sample. The induction relies on non-linear optimization and iterative state splitting from an initial order-one Markov chain. Experimental results illustrate the advantages of the proposed approach as compared to Baum-Welch HMM estimation or back-off smoothed N-grams equivalent to variable-order Markov chains.

