Results 1–10 of 39
LSTM Recurrent Networks Learn Simple Context-Free and Context-Sensitive Languages
 IEEE Transactions on Neural Networks
, 2001
"... Previous work on learning regular languages from exemplary training sequences showed that Long Short Term Memory (LSTM) outperforms traditional recurrent neural networks (RNNs). Here we demonstrate LSTM's superior performance on context free language (CFL) benchmarks for recurrent neural networks ..."
Abstract

Cited by 57 (21 self)
 Add to MetaCart
Previous work on learning regular languages from exemplary training sequences showed that Long Short-Term Memory (LSTM) outperforms traditional recurrent neural networks (RNNs). Here we demonstrate LSTM's superior performance on context-free language (CFL) benchmarks for RNNs, and show that it works even better than previous hardwired or highly specialized architectures.
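The CFL benchmarks in question involve languages such as a^n b^n, which require counting beyond what a finite automaton can do. As a minimal illustrative sketch (not taken from the paper), exemplary training sequences for that language can be generated and checked as follows:

```python
def anbn(n):
    """Generate a^n b^n, the canonical context-free benchmark language."""
    return "a" * n + "b" * n

def in_anbn(s):
    """Membership test for {a^n b^n : n >= 1}."""
    n = len(s) // 2
    return n > 0 and s == anbn(n)

# Exemplary positive training sequences of increasing length.
positives = [anbn(n) for n in range(1, 5)]
```

A learner that generalizes correctly must accept longer strings like a^10 b^10 while rejecting counterexamples such as "abab", which share the same symbol counts but not the structure.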
Learning Regular Languages From Simple Positive Examples
, 2000
"... Learning from positive data constitutes an important topic in Grammatical Inference since it is believed that the acquisition of grammar by children only needs syntactically correct (i.e. positive) instances. However, classical learning models provide no way to avoid the problem of overgeneralizati ..."
Abstract

Cited by 23 (0 self)
 Add to MetaCart
Learning from positive data constitutes an important topic in Grammatical Inference since it is believed that the acquisition of grammar by children only needs syntactically correct (i.e. positive) instances. However, classical learning models provide no way to avoid the problem of overgeneralization. In order to overcome this problem, we use here a learning model from simple examples, where the notion of simplicity is defined with the help of Kolmogorov complexity. We show that a general and natural heuristic which allows learning from simple positive examples can be developed in this model. Our main result is that the class of regular languages is probably exactly learnable from simple positive examples.
Learning Deterministic Regular Expressions for the Inference of Schemas from XML Data
, 2008
"... Inferring an appropriate DTD or XML Schema Definition (XSD) for a given collection of XML documents essentially reduces to learning deterministic regular expressions from sets of positive example words. Unfortunately, there is no algorithm capable of learning the complete class of deterministic regu ..."
Abstract

Cited by 22 (4 self)
 Add to MetaCart
Inferring an appropriate DTD or XML Schema Definition (XSD) for a given collection of XML documents essentially reduces to learning deterministic regular expressions from sets of positive example words. Unfortunately, there is no algorithm capable of learning the complete class of deterministic regular expressions from positive examples only, as we will show. The regular expressions occurring in practical DTDs and XSDs, however, are such that every alphabet symbol occurs only a small number of times. As such, in practice it suffices to learn the subclass of regular expressions in which each alphabet symbol occurs at most k times, for some small k. We refer to such expressions as k-occurrence regular expressions (k-OREs for short). Motivated by this observation, we provide a probabilistic algorithm that learns k-OREs for increasing values of k, and selects the one that best describes the sample based on a Minimum Description Length argument. The effectiveness of the method is empirically validated both on real-world and synthetic data. Furthermore, the method is shown to be conservative over the simpler classes of expressions considered in previous work.
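The defining property of a k-ORE is purely syntactic: no alphabet symbol may occur more than k times in the expression. A toy check (an illustrative sketch, assuming single lowercase letters as alphabet symbols; not the paper's algorithm) is just a symbol count:

```python
import re
from collections import Counter

def max_symbol_occurrences(regex):
    """Count how often each alphabet symbol appears in a regular expression,
    ignoring operators; the maximum is the smallest k for which the
    expression is a k-ORE. Assumes symbols are single lowercase letters."""
    counts = Counter(re.findall(r"[a-z]", regex))
    return max(counts.values(), default=0)

# (a+b)*a is a 2-ORE: symbol 'a' occurs twice, 'b' once.
# (a|b)c*  is a 1-ORE, i.e. a single occurrence regular expression.
```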
Logic-based Genetic Programming with Definite Clause Translation Grammars
 NEW GENERATION COMPUTING
, 2001
"... DCTGGP is a genetic programming system that uses definite clause translation grammars. A DCTG is a logical version of an attribute grammar that supports the definition of context–free languages, and it allows semantic information associated with a language to be easily accomodated by the grammar. T ..."
Abstract

Cited by 20 (10 self)
 Add to MetaCart
DCTG-GP is a genetic programming system that uses definite clause translation grammars. A DCTG is a logical version of an attribute grammar that supports the definition of context-free languages, and it allows semantic information associated with a language to be easily accommodated by the grammar. This is useful in genetic programming for defining the interpreter of a target language, or incorporating both syntactic and semantic problem-specific constraints into the evolutionary search. The DCTG-GP system improves on other grammar-based GP systems by permitting non-trivial semantic aspects of the language to be defined with the grammar. It also automatically analyzes grammar rules in order to determine their minimal depth and termination characteristics, which are required when generating random program trees of varied shapes and sizes. An application using DCTG-GP is described.
Information Extraction in Structured Documents using Tree Automata Induction
, 2002
"... Information extraction (IE) addresses the problem of extracting speci c information from a collection of documents. Much of the previous work for IE from structured documents formatted in HTML or XML uses techniques for IE from strings, such as grammar and automata induction. However, such docu ..."
Abstract

Cited by 19 (5 self)
 Add to MetaCart
Information extraction (IE) addresses the problem of extracting specific information from a collection of documents. Much of the previous work for IE from structured documents formatted in HTML or XML uses techniques for IE from strings, such as grammar and automata induction. However, such documents have a tree structure. Hence it is natural to investigate methods that are able to recognise and exploit this tree structure. We do this by exploring the use of tree automata for IE in structured documents. Experimental results on benchmark data sets show that our approach compares favorably with previous approaches.
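A bottom-up tree automaton assigns states to leaves first and propagates them upward, so acceptance depends on the whole subtree rather than a flattened string. A minimal sketch (illustrative only; the toy transition function and HTML-like labels are assumptions, not the paper's construction):

```python
def run(delta, tree):
    """Evaluate a deterministic bottom-up tree automaton on a tree given as
    (label, children): children are assigned states first, then delta maps
    (label, child state tuple) to a state for the node."""
    label, children = tree
    return delta(label, tuple(run(delta, c) for c in children))

def delta(label, child_states):
    """Toy transitions over HTML-like trees: an <li> leaf maps to 'item',
    a <ul> whose children are all items maps to 'list', else 'reject'."""
    if label == "li" and not child_states:
        return "item"
    if label == "ul" and child_states and all(s == "item" for s in child_states):
        return "list"
    return "reject"

doc = ("ul", [("li", []), ("li", [])])
```

Running `run(delta, doc)` yields the state "list", whereas a `<ul>` containing a stray `<p>` child ends in "reject"; a string automaton over the serialized document could not make this distinction as directly.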
Computational Complexity of Problems on Probabilistic Grammars and Transducers.
 In Proc. ICGI
, 2000
"... Determinism plays an important role in grammatical inference. ..."
Abstract

Cited by 19 (3 self)
 Add to MetaCart
Determinism plays an important role in grammatical inference.
Using grammatical inference to automate information extraction from the web
 In Principles of Data Mining and Knowledge Discovery
, 2001
"... Abstract. The WorldWide Web contains a wealth of semistructured information sources that often give partial/overlapping views on the same domains, such as real estate listings or book prices. These partial sources could be used more effectively if integrated into a single view; however, since they ..."
Abstract

Cited by 15 (1 self)
 Add to MetaCart
The World Wide Web contains a wealth of semi-structured information sources that often give partial/overlapping views on the same domains, such as real estate listings or book prices. These partial sources could be used more effectively if integrated into a single view; however, since they are typically formatted in diverse ways for human viewing, extracting their data for integration is a difficult challenge. Existing learning systems for this task generally use hard-coded ad hoc heuristics, are restricted in the domains and structures they can recognize, and/or require manual training. We describe a principled method for automatically generating extraction wrappers using grammatical inference that can recognize general structures and does not rely on manually-labelled examples. Domain-specific knowledge is explicitly separated out in the form of declarative rules. The method is demonstrated in a test setting by extracting real estate listings from web pages and integrating them into an interactive data visualization tool based on dynamic queries.
Probabilistic DFA Inference using Kullback-Leibler Divergence and Minimality
 In Seventeenth International Conference on Machine Learning
, 2000
"... Probabilistic DFA inference is the problem of inducing a stochastic regular grammar from a positive sample of an unknown language. The ALERGIA algorithm is one of the most successful approaches to this problem. In the present work we review this algorithm and explain why its generalization criterion ..."
Abstract

Cited by 11 (2 self)
 Add to MetaCart
Probabilistic DFA inference is the problem of inducing a stochastic regular grammar from a positive sample of an unknown language. The ALERGIA algorithm is one of the most successful approaches to this problem. In the present work we review this algorithm and explain why its generalization criterion, a state merging operation, is purely local. This characteristic leads to the conclusion that there is no explicit way to bound the divergence between the distribution defined by the solution and the training set distribution (that is, to control globally the generalization from the training sample). In this paper we present an alternative approach, the MDI algorithm, in which the solution is a probabilistic automaton that trades off minimal divergence from the training sample and minimal size. An efficient computation of the Kullback-Leibler divergence between two probabilistic DFAs is described, from which the new learning criterion is derived. Empirical results in the d...
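The contrast between the two criteria can be made concrete. ALERGIA merges two states when their observed transition frequencies pass a Hoeffding-style confidence test that looks only at those two states; MDI instead scores candidate solutions by a global divergence from the sample. A sketch of both ingredients (illustrative, with the usual Hoeffding bound and a finite-support KL divergence; not the papers' actual code):

```python
import math

def alergia_compatible(f1, n1, f2, n2, alpha=0.05):
    """ALERGIA's local merge test: two states with observed counts f out of n
    are compatible when their frequencies differ by less than a Hoeffding-style
    confidence bound. Nothing here constrains the global divergence of the
    resulting automaton, which is the limitation the abstract points out."""
    bound = math.sqrt(0.5 * math.log(2.0 / alpha)) * (
        1.0 / math.sqrt(n1) + 1.0 / math.sqrt(n2)
    )
    return abs(f1 / n1 - f2 / n2) < bound

def kl_divergence(p, q):
    """D(p || q) over finite distributions given as dicts: the global quantity
    MDI trades off against automaton size."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)
```

For example, frequencies 50/100 and 48/100 pass the local test while 90/100 versus 10/100 do not; the KL divergence is zero exactly when the two distributions coincide.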
Inference of Node Replacement Recursive Graph Grammars
 SIXTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2006
, 2006
"... Graph grammars combine the relational aspect of graphs with the iterative and recursive aspects of string grammars, and thus represent an important next step in our ability to discover knowledge from data. In this paper we describe an approach to learning node replacement graph grammars. This approa ..."
Abstract

Cited by 10 (6 self)
 Add to MetaCart
Graph grammars combine the relational aspect of graphs with the iterative and recursive aspects of string grammars, and thus represent an important next step in our ability to discover knowledge from data. In this paper we describe an approach to learning node replacement graph grammars. This approach is based on previous research in frequent isomorphic subgraphs discovery. We extend the search for frequent subgraphs by checking for overlap among the instances of the subgraphs in the input graph. If subgraphs overlap by one node we propose a node replacement grammar production. We can also infer a hierarchy of productions by compressing portions of a graph described by a production and then infer new productions on the compressed graph. We validate this approach in experiments where we generate graphs from known grammars and measure how well our system infers the original grammar from the generated graph. We also describe results on several real-world tasks from chemical mining to XML schema induction. We briefly discuss other grammar inference systems indicating that our study extends classes of learnable graph grammars.
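The trigger described above, two subgraph instances sharing exactly one node, reduces to a set-intersection check once instances are represented by their node ids. A minimal sketch (an assumption-laden illustration, not the system's implementation):

```python
from itertools import combinations

def overlapping_pairs(instances):
    """Given frequent-subgraph instances as sets of node ids, return the
    pairs that overlap in exactly one node -- the condition under which a
    node replacement grammar production would be proposed."""
    return [(a, b) for a, b in combinations(instances, 2) if len(a & b) == 1]

# Instances 1 and 2 share only node 3, so they would trigger a production;
# instance 3 is disjoint from both.
instances = [{1, 2, 3}, {3, 4, 5}, {6, 7, 8}]
```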
Inference of Concise Regular Expressions and DTDs
"... We consider the problem of inferring a concise Document Type Definition (DTD) for a given set of XMLdocuments, a problem that basically reduces to learning concise regular expressions from positive examples strings. We identify two classes of concise regular expressions—the single occurrence regula ..."
Abstract

Cited by 10 (2 self)
 Add to MetaCart
We consider the problem of inferring a concise Document Type Definition (DTD) for a given set of XML documents, a problem that basically reduces to learning concise regular expressions from positive example strings. We identify two classes of concise regular expressions—the single occurrence regular expressions (SOREs) and the chain regular expressions (CHAREs)—that capture the vast majority of expressions used in practical DTDs. For the inference of SOREs we present several algorithms that first infer an automaton for a given set of example strings and then translate that automaton to a corresponding SORE, possibly repairing the automaton when no equivalent SORE can be found. In the process, we introduce a novel automaton-to-regular-expression rewrite technique which is of independent interest. When only a very small amount of XML data is available, however (for instance when the data is generated by Web service requests or by answers to queries), these algorithms produce regular expressions that are too specific. Therefore, we introduce a novel learning algorithm CRX that directly infers CHAREs (which form a subclass of SOREs) without going through an automaton representation. We show that CRX performs very well within its target class on very small datasets.
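The appeal of chain expressions is that they can be read off the samples directly, without building an automaton. A toy sketch in that spirit (not the CRX algorithm itself; it assumes every sample lists its symbols in one shared global order, which real CRX does not require):

```python
def infer_chain_expression(samples):
    """Toy inference of a chain-like expression from positive example
    strings: take the symbol order from a longest sample, then annotate
    each symbol with ?, +, or * based on its minimum and maximum number
    of occurrences across all samples."""
    order = []
    for ch in max(samples, key=len):  # symbol order from a longest sample
        if ch not in order:
            order.append(ch)
    parts = []
    for sym in order:
        counts = [w.count(sym) for w in samples]
        lo, hi = min(counts), max(counts)
        quant = "*" if lo == 0 and hi > 1 else "?" if lo == 0 else "+" if hi > 1 else ""
        parts.append(sym + quant)
    return "".join(parts)
```

For instance, the samples ["aab", "ab", "b"] yield "a*b": symbol a is sometimes absent and sometimes repeated, while b occurs exactly once in every sample.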