PAC-learnability of Probabilistic Deterministic Finite State Automata
- Journal of Machine Learning Research, 2004
Cited by 19 (7 self)
We study the learnability of Probabilistic Deterministic Finite State Automata under a modified PAC-learning criterion. We argue that it is necessary to add additional parameters to the sample complexity polynomial, namely a bound on the expected length of strings generated from any state, and a bound on the distinguishability between states. With this, we demonstrate that the class of PDFAs is PAC-learnable using a variant of a standard state-merging algorithm and the Kullback-Leibler divergence as the error function.
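As a rough illustration of the error function named in this abstract, the following minimal sketch computes the Kullback-Leibler divergence between two distributions over strings. It assumes both distributions have finite support and are given as plain dictionaries; the function name `kl_divergence` and the example distributions are hypothetical and are not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """Return D(p || q) in nats for two dicts mapping strings to probabilities.

    Assumption: both distributions have finite support. A real PDFA
    defines a distribution over infinitely many strings, so this is
    only a toy version of the error measure.
    """
    total = 0.0
    for s, ps in p.items():
        if ps == 0.0:
            continue  # terms with p(s) = 0 contribute nothing
        qs = q.get(s, 0.0)
        if qs == 0.0:
            return math.inf  # q gives zero mass to a string p can generate
        total += ps * math.log(ps / qs)
    return total


# Hypothetical target and hypothesis distributions over three strings.
target = {"a": 0.5, "ab": 0.25, "abb": 0.25}
hypothesis = {"a": 0.4, "ab": 0.4, "abb": 0.2}
divergence = kl_divergence(target, hypothesis)
```

The divergence is zero when the hypothesis matches the target exactly and infinite when the hypothesis assigns zero probability to a string the target can generate, which is one reason smoothing matters when this measure is used as an error function.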
Probabilistic Finite-State Machines - Part I
Cited by 9 (1 self)
Probabilistic finite-state machines are used today in a variety of areas in pattern recognition, or in fields to which pattern recognition is linked, including computational linguistics, machine learning, time series analysis, circuit testing, computational biology, speech recognition, and machine translation. In Part I of this paper we survey these generative objects and study their definitions and properties. In Part II, we will study the relation of probabilistic finite-state automata with other well-known devices that generate strings, such as hidden Markov models and n-grams, and provide theorems, algorithms and properties that represent the current state of the art of these objects.
Improvement of the State Merging Rule on Noisy Data in Probabilistic Grammatical Inference
- 10th European Conference on Machine Learning. Number 2837 in LNAI, Springer-Verlag (2003) 169–180
Cited by 2 (0 self)
In this paper we study the influence of noise in probabilistic grammatical inference. We paradoxically bring out the idea that specialized automata deal better with noisy data than more general ones. We then propose to replace the statistical test of the Alergia algorithm by a more restrictive merging rule based on a test of proportion comparison.
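To make the contrast in this abstract concrete, the sketch below compares two ways of deciding whether two observed frequencies are compatible, which is the decision behind merging two states. The first function uses the Hoeffding-bound test commonly associated with Alergia. The second is a standard pooled two-proportion z-test, used here only as a stand-in: the abstract does not specify the exact test the authors use, and the function names, the default `alpha`, and the critical value 1.96 are all assumptions.

```python
import math


def hoeffding_compatible(f1, n1, f2, n2, alpha=0.05):
    """Hoeffding-bound compatibility test in the style of Alergia.

    f1/n1 and f2/n2 are observed frequencies; they are judged
    compatible when their difference stays below the bound.
    """
    bound = math.sqrt(0.5 * math.log(2.0 / alpha)) * (
        1.0 / math.sqrt(n1) + 1.0 / math.sqrt(n2)
    )
    return abs(f1 / n1 - f2 / n2) < bound


def proportion_z_compatible(f1, n1, f2, n2, z_crit=1.96):
    """Pooled two-proportion z-test (an assumed stand-in, not the paper's rule)."""
    pooled = (f1 + f2) / (n1 + n2)
    variance = pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2)
    if variance == 0.0:
        # Both samples are all-successes or all-failures: compare directly.
        return f1 / n1 == f2 / n2
    z = abs(f1 / n1 - f2 / n2) / math.sqrt(variance)
    return z < z_crit


# Hypothetical counts: 30 of 100 versus 50 of 100 observations.
merge_by_hoeffding = hoeffding_compatible(30, 100, 50, 100)
merge_by_z_test = proportion_z_compatible(30, 100, 50, 100)
```

With these hypothetical counts the Hoeffding test accepts the merge while the z-test rejects it, so in this example the proportion test is the more restrictive rule and would leave the automaton more specialized.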
Learning Hidden Markov Models to Fit Long-Term Dependencies
- 2005
Cited by 2 (2 self)
this report a novel approach to the induction of the structure of Hidden Markov Models (HMMs). The notion of partially observable Markov models (POMMs) is introduced. POMMs form a particular case of HMMs where any state emits a single letter with probability one, but several states can emit the same letter. It is shown that any HMM can be represented by an equivalent POMM. The proposed induction algorithm aims at finding a POMM fitting the dynamics of the target machine, that is to best approximate the stationary distribution and the mean first passage times observed in the sample. The induction relies on non-linear optimization and iterative state splitting from an initial order one Markov chain. Experimental results illustrate the advantages of the proposed approach as compared to Baum-Welch HMM estimation or back-off smoothed N-grams equivalent to variable order Markov chains.
Towards Feasible PAC-Learning of Probabilistic Deterministic Finite Automata
- 2008
Cited by 1 (0 self)
We present an improvement of an algorithm due to Clark and
in Language Learning
We present a computational model of language learning via a sequence of interactions between a teacher and a learner. The utterances of the teacher and learner refer to shared situations, and the learner uses cross-situational correspondences to learn to comprehend the teacher’s utterances and produce appropriate utterances of its own. We show that in this model the teacher and learner come to be able to understand each other’s meanings. Moreover, the teacher is able to produce meaning-preserving corrections of the learner’s utterances, and the learner is able to detect them. We test our model with limited sublanguages of several natural languages in a common domain of situations. The results show that learning to a high level of performance occurs after a reasonable number of interactions. Moreover, even if the learner does not treat corrections specially, in several cases a high level of performance is achieved significantly sooner by a learner interacting with a correcting teacher than by a learner interacting with a non-correcting teacher. Demonstrating the benefit of semantics to the learner, we compare the number of interactions to reach a high level of performance in our system with the number of similarly generated utterances (with no semantics) required by the ALERGIA algorithm to achieve the same level of performance. We also define and analyze a simplified model of a probabilistic process of collecting corrections to help understand the possibilities and limitations of corrections in our setting.
Efficient Pruning of Probabilistic Automata
- Franck Thollard and Baptiste Jeudy
Applications of probabilistic grammatical inference are limited by time and space constraints. In statistical language modeling, for example, large corpora are now available and lead to automata with millions of states. We propose in this article a method for pruning automata (when restricted to tree-based structures) that is not only efficient (sub-quadratic) but also dramatically reduces the size of the automaton with only a small impact on the underlying distribution. Results are evaluated on a language modeling task.

