Results 1 -
7 of
7
Probabilistic Finite-State Machines - Part I
"... Probabilistic finite-state machines are used today in a variety of areas in pattern recognition, or in fields to which pattern recognition is linked: computational linguistics, machine learning, time series analysis, circuit testing, computational biology, speech recognition and machine translatio ..."
Abstract
-
Cited by 9 (1 self)
- Add to MetaCart
Probabilistic finite-state machines are used today in a variety of areas in pattern recognition, or in fields to which pattern recognition is linked: computational linguistics, machine learning, time series analysis, circuit testing, computational biology, speech recognition and machine translation are some of them. In part I of this paper we survey these generative objects and study their definitions and properties. In part II, we will study the relation of probabilistic finite-state automata with other well known devices that generate strings as hidden Markov models and n-grams, and provide theorems, algorithms and properties that represent a current state of the art of these objects.
Probabilistic DFA Inference using Kullback-Leibler Divergence and Minimality
- In Seventeenth International Conference on Machine Learning
, 2000
"... Probabilistic DFA inference is the problem of inducing a stochastic regular grammar from a positive sample of an unknown language. The ALERGIA algorithm is one of the most successful approaches to this problem. In the present work we review this algorithm and explain why its generalization criterion ..."
Abstract
-
Cited by 6 (2 self)
- Add to MetaCart
Probabilistic DFA inference is the problem of inducing a stochastic regular grammar from a positive sample of an unknown language. The ALERGIA algorithm is one of the most successful approaches to this problem. In the present work we review this algorithm and explain why its generalization criterion, a state merging operation, is purely local. This characteristic leads to the conclusion that there is no explicit way to bound the divergence between the distribution de ned by the solution and the training set distribution (that is, to control globally the generalization from the training sample). In this paper we present an alternative approach, the MDI algorithm, in which the solution is a probabilistic automaton that trades o minimal divergence from the training sample and minimal size. An e cient computation of the Kullback-Leibler divergence between two probabilistic DFAs is described, from which the new learning criterion is derived. Empirical results in the d...
Alternative Approaches for Generating Bodies of Grammar Rules
, 2004
"... We compare two approaches for describing and generating bodies of rules used for natural language parsing. In today's parsers rule bodies do not exist a priori but are generated on the fly, usually with methods based on n-grams, which are one particular way of inducing probabilistic regular language ..."
Abstract
-
Cited by 4 (4 self)
- Add to MetaCart
We compare two approaches for describing and generating bodies of rules used for natural language parsing. In today's parsers rule bodies do not exist a priori but are generated on the fly, usually with methods based on n-grams, which are one particular way of inducing probabilistic regular languages. We compare two approaches for inducing such languages. One is based on n-grams, the other on minimization of the Kullback-Leibler divergence. The inferred regular languages are used for generating bodies of rules inside a parsing procedure. We compare the two approaches along two dimensions: the quality of the probabilistic regular language they produce, and the performance of the parser they were used to build. The second approach outperforms the first one along both dimensions.
Learning Hidden Markov Models to Fit Long-Term Dependencies
, 2005
"... this report a novel approach to the induction of the structure of Hidden Markov Models (HMMs). The notion of partially observable Markov models (POMMs) is introduced. POMMs form a particular case of HMMs where any state emits a single letter with probability one, but several states can emit the ..."
Abstract
-
Cited by 2 (2 self)
- Add to MetaCart
this report a novel approach to the induction of the structure of Hidden Markov Models (HMMs). The notion of partially observable Markov models (POMMs) is introduced. POMMs form a particular case of HMMs where any state emits a single letter with probability one, but several states can emit the same letter. It is shown that any HMM can be represented by an equivalent POMM. The proposed induction algorithm aims at finding a POMM fitting the dynamics of the target machine, that is to best approximate the stationary distribution and the mean first passage times observed in the sample. The induction relies on non-linear optimization and iterative state splitting from an initial order one Markov chain. Experimental results illustrate the advantages of the proposed approach as compared to Baum-Welch HMM estimation or back-o# smoothed Ngrams equivalent to variable order Markov chains
A Bernoulli mixture model for word categorisation
, 2001
"... The problem of word categorisation is formulated as one of unsupervised mixture modelling where Bernoulli distributions capture contextual information. ..."
Abstract
-
Cited by 2 (2 self)
- Add to MetaCart
The problem of word categorisation is formulated as one of unsupervised mixture modelling where Bernoulli distributions capture contextual information.
Ten Open Problems in Grammatical Inference ⋆
"... Abstract. We propose 10 different open problems in the field of grammatical inference. In all cases, problems are theoretically oriented but correspond to practical questions. They cover the areas of polynomial learning models, learning from ordered alphabets, learning deterministic Pomdps, learning ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Abstract. We propose 10 different open problems in the field of grammatical inference. In all cases, problems are theoretically oriented but correspond to practical questions. They cover the areas of polynomial learning models, learning from ordered alphabets, learning deterministic Pomdps, learning negotiation processes, learning from context-free background knowledge. 1
Efficient Pruning of Probabilistic Automata 1 Franck Thollard and Baptiste Jeudy
"... Abstract. Applications of probabilistic grammatical inference are limited due to time and space consuming constraints. In statistical language modeling, for example, large corpora are now available and lead to managing automata with millions of states. We propose in this article a method for pruning ..."
Abstract
- Add to MetaCart
Abstract. Applications of probabilistic grammatical inference are limited due to time and space consuming constraints. In statistical language modeling, for example, large corpora are now available and lead to managing automata with millions of states. We propose in this article a method for pruning automata (when restricted to tree based structures) which is not only efficient (sub-quadratic) but that allows to dramatically reduce the size of the automaton with a small impact on the underlying distribution. Results are evaluated on a language modeling task. 1

