Convolution Kernels on Discrete Structures
, 1999
Cited by 403 (0 self)
We introduce a new method of constructing kernels on sets whose elements are discrete structures like strings, trees and graphs. The method can be applied iteratively to build a kernel on an infinite set from kernels involving generators of the set. The family of kernels generated generalizes the family of radial basis kernels. It can also be used to define kernels in the form of joint Gibbs probability distributions. Kernels can be built from hidden Markov random fields, generalized regular expressions, pair-HMMs, or ANOVA decompositions. Uses of the method lead to open problems involving the theory of infinitely divisible positive definite functions. Fundamentals of this theory and the theory of reproducing kernel Hilbert spaces are reviewed and applied in establishing the validity of the method.
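As a rough illustration of the idea of building a kernel on structures from kernels on their parts, the sketch below sums products of part-level kernels over all decompositions of two strings. It is not taken from the paper: the choice of parts (every prefix/suffix split), the bag-of-characters base kernel, and all function names are assumptions made for the example.

```python
from collections import Counter


def bag_kernel(a: str, b: str) -> int:
    """Base kernel (assumed): dot product of character-count vectors."""
    count_a, count_b = Counter(a), Counter(b)
    return sum(count_a[ch] * count_b[ch] for ch in count_a)


def splits(s: str) -> list[tuple[str, str]]:
    """All decompositions of s into a (prefix, suffix) pair."""
    return [(s[:i], s[i:]) for i in range(len(s) + 1)]


def convolution_kernel(x: str, y: str) -> int:
    """Sum, over every pair of decompositions, of the product of part kernels."""
    return sum(
        bag_kernel(x_prefix, y_prefix) * bag_kernel(x_suffix, y_suffix)
        for x_prefix, x_suffix in splits(x)
        for y_prefix, y_suffix in splits(y)
    )
```

Because the base kernel is a dot product and the set of decompositions is finite, the resulting function is symmetric in its arguments; other part kernels could be substituted for `bag_kernel` without changing the overall shape of the computation.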
Two Experiments on Learning Probabilistic Dependency Grammars from Corpora
 Working Notes of the Workshop Statistically-Based NLP Techniques
, 1992
Cited by 99 (5 self)
Introduction. We present a scheme for learning probabilistic dependency grammars from positive training examples plus constraints on rules. In particular, we present the results of two experiments. The first, in which the constraints were minimal, was unsuccessful. The second, with significant constraints, was successful within the bounds of the task we had set. We will explicate dependency grammars in Section 2. For the moment we simply note that they are a very restricted class of grammars which do not fit exactly into the Chomsky hierarchy, but whose appearance is most like the context-free grammars. We assume that the goal of learning a context-free grammar needs no justification. The problem has attracted a fair amount of attention ([1,4] are good surveys), but no good solutions have been found. Our choice of learning from only positive training examples needs only a little more justification. Obviously, if it is possible, a scheme which only uses positive training exampl
System Identification, Approximation and Complexity
 International Journal of General Systems
, 1977
Cited by 36 (22 self)
This paper is concerned with establishing broadly-based system-theoretic foundations and practical techniques for the problem of system identification that are rigorous, intuitively clear and conceptually powerful. A general formulation is first given in which two order relations are postulated on a class of models: a constant one of complexity; and a variable one of approximation induced by an observed behaviour. An admissible model is such that any less complex model is a worse approximation. The general problem of identification is that of finding the admissible subspace of models induced by a given behaviour. It is proved under very general assumptions that, if deterministic models are required, then nearly all behaviours require models of nearly maximum complexity. A general theory of approximation between models and behaviour is then developed based on subjective probability concepts and semantic information theory. The role of structural constraints such as causality, locality, finite memory, etc., is then discussed as rules of the game. These concepts and results are applied to the specific problem of stochastic automaton, or grammar, inference. Computational results are given to demonstrate that the theory is complete and fully operational. Finally, the formulation of identification proposed in this paper is analysed in terms of Klir’s epistemological hierarchy and both are discussed in terms of the rich philosophical literature on the acquisition of knowledge.
Incremental Regular Inference
 Proceedings of the Third ICGI-96
, 1996
Cited by 33 (2 self)
In this paper, we extend the characterization of the search space of regular inference [DMV94] to sequential presentations of learning data. We propose the RPNI2 algorithm, an incremental extension of the RPNI algorithm. We study the convergence and complexities of both algorithms from a theoretical and practical point of view. These results are assessed on the Feldman task. 1 Introduction. Regular inference is the problem of learning a regular language from a positive sample, that is, a finite set of strings supposed to be drawn from a target language. Whenever a negative sample, that is, a finite set of strings not belonging to the target language, is also available, the problem may be solved by the RPNI algorithm proposed by Oncina and García [OG92] and, independently, by Lang [Lan92]. The RPNI algorithm has been shown to identify in the limit any regular language with polynomial complexity as a function of the positive and negative sample sizes. However, this algorithm requir...
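To make the roles of the positive and negative samples concrete, here is a minimal sketch of a prefix tree acceptor built from a positive sample, together with a check that a hypothesis accepts every positive string and rejects every negative one. This is only the usual starting point for state-merging inference, not RPNI or RPNI2 themselves; the state-merging step is omitted, and the data layout and function names are assumptions made for the example.

```python
def build_pta(positive):
    """Build a prefix tree acceptor: one state per distinct prefix, root is state 0."""
    transitions = {}   # (state, symbol) -> next state
    accepting = set()  # states reached by a complete positive string
    next_state = 1
    for word in positive:
        state = 0
        for symbol in word:
            if (state, symbol) not in transitions:
                transitions[(state, symbol)] = next_state
                next_state += 1
            state = transitions[(state, symbol)]
        accepting.add(state)
    return transitions, accepting


def accepts(transitions, accepting, word):
    """Run the deterministic automaton on word; a missing transition rejects."""
    state = 0
    for symbol in word:
        state = transitions.get((state, symbol))
        if state is None:
            return False
    return state in accepting


def consistent(transitions, accepting, positive, negative):
    """True if all positive strings are accepted and no negative string is."""
    return (all(accepts(transitions, accepting, w) for w in positive)
            and not any(accepts(transitions, accepting, w) for w in negative))
```

A state-merging learner would repeatedly merge states of this automaton and keep a merge only if `consistent` still holds for the resulting hypothesis.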
Logic-based Genetic Programming with Definite Clause Translation Grammars
 NEW GENERATION COMPUTING
, 2001
Cited by 22 (10 self)
DCTG-GP is a genetic programming system that uses definite clause translation grammars. A DCTG is a logical version of an attribute grammar that supports the definition of context-free languages, and it allows semantic information associated with a language to be easily accommodated by the grammar. This is useful in genetic programming for defining the interpreter of a target language, or incorporating both syntactic and semantic problem-specific constraints into the evolutionary search. The DCTG-GP system improves on other grammar-based GP systems by permitting non-trivial semantic aspects of the language to be defined with the grammar. It also automatically analyzes grammar rules in order to determine their minimal depth and termination characteristics, which are required when generating random program trees of varied shapes and sizes. An application using DCTG-GP is described.
Grammar Inference, Automata Induction, and Language Acquisition
 Handbook of Natural Language Processing
, 2000
Cited by 22 (1 self)
The natural language learning problem has attracted the attention of researchers for several decades. Computational and formal models of language acquisition have provided some preliminary, yet promising, insights into how children learn the language of their community. Further, these formal models also provide an operational framework for the numerous practical applications of language learning. We will survey some of the key results in formal language learning. In particular, we will discuss the prominent computational approaches for learning different classes of formal languages and discuss how these fit in the broad context of natural language learning.
Learning a Class of Large Finite State Machines with a Recurrent Neural Network
, 1995
Cited by 22 (11 self)
One of the issues in any learning model is how it scales with problem size. The problem of learning finite state machines (FSMs) from examples with recurrent neural networks has been extensively explored. However, these results are somewhat disappointing in the sense that the machines that can be learned are too small to be competitive with existing grammatical inference algorithms. We show that a type of recurrent neural network (Narendra & Parthasarathy, 1990, IEEE Trans. Neural Networks, 1, 427) which has feedback but no hidden state neurons can learn a special type of FSM called a finite memory machine (FMM) under certain constraints. These machines have a large number of states (simulations are for 256 and 512 state FMMs) but have minimal order, relatively small depth and little logic when the FMM is implemented as a sequential machine,
Probabilistic Finite-State Machines - Part I
Cited by 16 (1 self)
Probabilistic finite-state machines are used today in a variety of areas in pattern recognition, or in fields to which pattern recognition is linked: computational linguistics, machine learning, time series analysis, circuit testing, computational biology, speech recognition and machine translation are some of them. In Part I of this paper we survey these generative objects and study their definitions and properties. In Part II, we will study the relation of probabilistic finite-state automata with other well-known devices that generate strings, such as hidden Markov models and n-grams, and provide theorems, algorithms and properties that represent a current state of the art of these objects.
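For readers unfamiliar with these objects, the sketch below shows one common way to compute the probability that a probabilistic finite-state automaton generates a given string, by propagating probability mass forward one symbol at a time. The dictionary-based representation, the parameter names, and the one-state example automaton are assumptions made for the illustration, not definitions from the paper.

```python
def string_probability(initial, transitions, final, word):
    """Probability of generating `word` and then stopping.

    initial:     {state: probability of starting in that state}
    transitions: {(state, symbol): [(next_state, probability), ...]}
    final:       {state: probability of stopping in that state}
    """
    alpha = dict(initial)  # mass on each state after the symbols read so far
    for symbol in word:
        next_alpha = {}
        for state, mass in alpha.items():
            for target, prob in transitions.get((state, symbol), []):
                next_alpha[target] = next_alpha.get(target, 0.0) + mass * prob
        alpha = next_alpha
    return sum(mass * final.get(state, 0.0) for state, mass in alpha.items())


# Hypothetical one-state automaton: emit "a" and stay with probability 0.5,
# or stop with probability 0.5.
INITIAL = {0: 1.0}
TRANSITIONS = {(0, "a"): [(0, 0.5)]}
FINAL = {0: 0.5}
```

With this example automaton, a string of n copies of "a" receives probability 0.5 ** (n + 1), and any string containing another symbol receives probability 0.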
Automatic Acquisition of Language Models for Speech Recognition
, 1994
Cited by 14 (3 self)
This thesis focuses on the automatic acquisition of language structure and the subsequent use of the learned language structure to improve the performance of a speech recognition system. First, we develop a grammar inference process which is able to learn a grammar describing a large set of training sentences. The process of acquiring this grammar is one of generalization so that the resulting grammar predicts likely sentences beyond those contained in the training set. From the grammar we construct a novel probabilistic language model called the phrase class n-gram model (pcng), which is a natural generalization of the word class n-gram model [11] to phrase classes. This model utilizes the grammar in such a way that it maintains full coverage of any test set while at the same time reducing the complexity, or number of parameters, of the resulting predictive model. Positive results are shown in terms of perplexity of the acquired phrase class n-gram models and in terms of reduction of ...
A Polynomial Time Incremental Algorithm for Learning DFA
Cited by 11 (4 self)
We present an efficient incremental algorithm for learning deterministic finite state automata (DFA) from labeled examples and membership queries. This algorithm is an extension of Angluin's ID procedure to an incremental framework. The learning algorithm is intermittently provided with labeled examples and has access to a knowledgeable teacher capable of answering membership queries. The learner constructs an initial hypothesis from the given set of labeled examples and the teacher's responses to membership queries. If an additional example observed by the learner is inconsistent with the current hypothesis then the hypothesis is modified minimally to make it consistent with the new example. The update procedure ensures that the modified hypothesis is consistent with all examples observed thus far. The algorithm is guaranteed to converge to a minimum state DFA corresponding to the target when the set of examples observed by the learner includes a live complete set. We prove the convergence of this algorithm and analyze its time and space complexities.