Results 1 - 3 of 3
Probabilistic DFA Inference using Kullback-Leibler Divergence and Minimality
In Seventeenth International Conference on Machine Learning, 2000
"... Probabilistic DFA inference is the problem of inducing a stochastic regular grammar from a positive sample of an unknown language. The ALERGIA algorithm is one of the most successful approaches to this problem. In the present work we review this algorithm and explain why its generalization criterion ..."
Abstract

Cited by 11 (2 self)
Probabilistic DFA inference is the problem of inducing a stochastic regular grammar from a positive sample of an unknown language. The ALERGIA algorithm is one of the most successful approaches to this problem. In the present work we review this algorithm and explain why its generalization criterion, a state merging operation, is purely local. This characteristic leads to the conclusion that there is no explicit way to bound the divergence between the distribution defined by the solution and the training set distribution (that is, to control globally the generalization from the training sample). In this paper we present an alternative approach, the MDI algorithm, in which the solution is a probabilistic automaton that trades off minimal divergence from the training sample and minimal size. An efficient computation of the Kullback-Leibler divergence between two probabilistic DFAs is described, from which the new learning criterion is derived. Empirical results in the d...
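The divergence this abstract refers to can be illustrated with a brute-force sketch: enumerate all strings up to a fixed length and sum p1(s)·log(p1(s)/p2(s)). This is not the paper's efficient exact computation, and the dictionary encoding of a probabilistic DFA and the two toy automata below are hypothetical, chosen only to make the quantity concrete.

```python
# Brute-force, truncated approximation of the Kullback-Leibler divergence
# between two probabilistic DFAs (PDFAs). The PDFA encoding here is an
# illustrative assumption: a start state, a transition map
# (state, symbol) -> (next_state, probability), and per-state stopping
# probabilities.
from itertools import product
from math import log

def string_prob(pdfa, s):
    """Probability a PDFA assigns to string s: the product of transition
    probabilities along s times the stopping probability of the final state."""
    state, p = pdfa["start"], 1.0
    for sym in s:
        if (state, sym) not in pdfa["trans"]:
            return 0.0
        state, q = pdfa["trans"][(state, sym)]
        p *= q
    return p * pdfa["final"].get(state, 0.0)

def kl_truncated(p1, p2, alphabet, max_len):
    """Sum p1(s) * log(p1(s) / p2(s)) over all strings of length <= max_len.
    Note: because mass beyond the cutoff is ignored, small cutoffs can yield
    a negative value even though the full KL divergence is nonnegative."""
    total = 0.0
    for n in range(max_len + 1):
        for s in product(alphabet, repeat=n):
            a, b = string_prob(p1, s), string_prob(p2, s)
            if a > 0.0:
                total += a * log(a / b)  # undefined if b == 0 while a > 0
    return total

# Two toy one-state PDFAs over {a, b} that differ in their stopping probability.
A = {"start": 0, "trans": {(0, "a"): (0, 0.3), (0, "b"): (0, 0.3)},
     "final": {0: 0.4}}
B = {"start": 0, "trans": {(0, "a"): (0, 0.25), (0, "b"): (0, 0.25)},
     "final": {0: 0.5}}
```

The divergence of a PDFA from itself is zero at any cutoff, while `kl_truncated(A, B, ...)` becomes strictly positive once the cutoff is large enough to cover most of the probability mass.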
Ten Open Problems in Grammatical Inference
"... Abstract. We propose 10 different open problems in the field of grammatical inference. In all cases, problems are theoretically oriented but correspond to practical questions. They cover the areas of polynomial learning models, learning from ordered alphabets, learning deterministic Pomdps, learning ..."
Abstract

Cited by 2 (0 self)
Abstract. We propose 10 different open problems in the field of grammatical inference. In all cases, the problems are theoretically oriented but correspond to practical questions. They cover the areas of polynomial learning models, learning from ordered alphabets, learning deterministic POMDPs, learning negotiation processes, and learning from context-free background knowledge.
LARS: A learning algorithm for rewriting systems
, 2007
"... Whereas there is a number of methods and algorithms to learn regular languages, moving up the Chomsky hierarchy is proving to be a challenging task. Indeed, several theoretical barriers make the class of contextfree languages hard to learn. To tackle these barriers, we choose to change the way we r ..."
Abstract
Whereas there are a number of methods and algorithms to learn regular languages, moving up the Chomsky hierarchy is proving to be a challenging task. Indeed, several theoretical barriers make the class of context-free languages hard to learn. To tackle these barriers, we choose to change the way we represent these languages. Among the formalisms that allow the definition of classes of languages, that of string-rewriting systems (SRS) has outstanding properties. We introduce a new type of SRS, called Delimited SRS (DSRS), that are expressive enough to define, in a uniform way, a noteworthy and non-trivial class of languages that contains all the regular languages, {a^n b^n : n ≥ 0}, {w ∈ {a, b}* : |w|_a = |w|_b}, the parenthesis languages of Dyck, the language of Łukasiewicz, and many others. Moreover, DSRSs constitute an efficient (often linear) parsing device for strings, and are thus promising candidates for forthcoming applications of grammatical inference. In this paper, we pioneer the problem of their learnability. We propose a novel and sound algorithm (called LARS) which identifies a large subclass of them in polynomial time (but not data). We illustrate the execution of our algorithm through several examples, discuss the position of the class in the Chomsky hierarchy, and finally raise some open questions and research directions.
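The parsing device the abstract mentions can be illustrated with a toy sketch of rewriting-based recognition. This is not the LARS algorithm, and a plain, unanchored deletion rule (without the delimiters that distinguish DSRSs) only captures some of the example languages; the helper name below is illustrative. The idea is simply to apply a rewrite rule until a fixpoint and accept a string iff it reduces to a designated normal form, here the empty string.

```python
# Toy rewriting-based recognizer (illustrative, not the LARS algorithm):
# repeatedly delete occurrences of a left-hand side `lhs` and accept the
# input iff it reduces to the empty string. Each replacement shortens the
# string, so the loop always terminates.
def reduces_to_empty(s, lhs):
    """Apply the deletion rule lhs -> (empty) to a fixpoint; accept iff empty."""
    while lhs in s:
        s = s.replace(lhs, "", 1)  # one rewriting step
    return s == ""
```

For instance, the Dyck language of balanced parentheses is recognized by the single rule `() -> (empty)`: `reduces_to_empty("(()())", "()")` accepts, while `reduces_to_empty("(()", "()")` rejects. Capturing {a^n b^n : n ≥ 0} exactly is where the delimiters of a DSRS come in, since the unanchored rule `ab -> (empty)` also accepts strings such as `abab`.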