Results 1-10 of 34
The induction of dynamical recognizers
Machine Learning, 1991
Cited by 210 (14 self)
A higher order recurrent neural network architecture learns to recognize and generate languages after being "trained" on categorized exemplars. Studying these networks from the perspective of dynamical systems yields two interesting discoveries: First, a longitudinal examination of the learning process illustrates a new form of mechanical inference: Induction by phase transition. A small weight adjustment causes a "bifurcation" in the limit behavior of the network. This phase transition corresponds to the onset of the network's capacity for generalizing to arbitrary-length strings. Second, a study of the automata resulting from the acquisition of previously published training sets indicates that while the architecture is not guaranteed to find a minimal finite automaton consistent with the given exemplars, which is an NP-hard problem, the architecture does appear capable of generating nonregular languages by exploiting fractal and chaotic dynamics. I end the paper with a hypothesis relating linguistic generative capacity to the behavioral regimes of nonlinear dynamical systems.
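The kind of recognizer this abstract describes can be sketched as a second-order recurrent update, in which each input symbol selects its own weight matrix for mapping the current state to the next one. This is an illustrative sketch only, not the paper's architecture or training procedure: the weights are random rather than learned, and the names `make_recognizer`, `run`, and `accepts`, the state size, the initial state, and the 0.5 acceptance threshold are all assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_recognizer(n_state=3, alphabet="01", seed=0):
    """Build random (untrained) weights: one matrix per input symbol.

    weights[symbol][i][j] multiplies state unit j (last column is a bias)
    when computing the next value of state unit i.
    """
    rng = random.Random(seed)
    return {
        s: [[rng.uniform(-2.0, 2.0) for _ in range(n_state + 1)]
            for _ in range(n_state)]
        for s in alphabet
    }


def run(weights, string):
    """Return the state trajectory visited while reading `string`."""
    n_state = len(next(iter(weights.values())))
    state = [0.5] * n_state          # assumed initial state
    trajectory = [tuple(state)]
    for symbol in string:
        matrix = weights[symbol]     # the symbol picks the state-to-state map
        extended = state + [1.0]     # append constant input for the bias
        state = [sigmoid(sum(w * z for w, z in zip(row, extended)))
                 for row in matrix]
        trajectory.append(tuple(state))
    return trajectory


def accepts(weights, string):
    """Accept when the first state unit ends above the assumed threshold."""
    return run(weights, string)[-1][0] > 0.5
```

Plotting the points of `run(weights, s)` for many strings `s` is one way to look at the state sets whose geometry the abstract relates to fractal and chaotic dynamics.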
An Input Output HMM Architecture
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS, 1995
Cited by 108 (15 self)
We introduce a recurrent architecture having a modular structure and we formulate a training procedure based on the EM algorithm. The resulting model has similarities to hidden Markov models, but supports a recurrent-network processing style and allows the supervised learning paradigm to be exploited while using maximum likelihood estimation.
Input/output HMMs for sequence processing
IEEE Transactions on Neural Networks, 1996
Cited by 97 (12 self)
We consider problems of sequence processing and propose a solution based on a discrete state model in order to represent past context. We introduce a recurrent connectionist architecture having a modular structure that associates a subnetwork to each state. The model has a statistical interpretation we call Input/Output Hidden Markov Model (IOHMM). It can be trained by the EM or GEM algorithms, considering state trajectories as missing data, which decouples temporal credit assignment and actual parameter estimation. The model presents similarities to hidden Markov models (HMMs), but allows us to map input sequences to output sequences, using the same processing style as recurrent neural networks. IOHMMs are trained using a more discriminant learning paradigm than HMMs, while potentially taking advantage of the EM algorithm. We demonstrate that IOHMMs are well suited for solving grammatical inference problems on a benchmark problem. Experimental results are presented for the seven Tomita grammars, showing that these adaptive models can attain excellent generalization.
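For orientation, the Tomita benchmark consists of seven small regular languages over the alphabet {0, 1}. The sketch below encodes four of them as they are commonly stated in the grammatical-inference literature; the abstract itself does not define them, so these definitions are an assumption here. The dictionary name `TOMITA` and the helper `label` are illustrative.

```python
import re

# Four of the seven Tomita languages, written as regular expressions.
# Definitions follow the common statement in the literature (assumed here).
TOMITA = {
    1: re.compile(r"1*"),              # strings consisting only of 1s
    2: re.compile(r"(10)*"),           # zero or more repetitions of "10"
    4: re.compile(r"(?!.*000)[01]*"),  # strings with no "000" substring
    7: re.compile(r"0*1*0*1*"),        # 0s, then 1s, then 0s, then 1s
}


def label(grammar, string):
    """Return True if `string` belongs to the given Tomita language."""
    return TOMITA[grammar].fullmatch(string) is not None


if __name__ == "__main__":
    for g, s in [(1, "111"), (2, "101"), (4, "1001001"), (7, "01010")]:
        print(g, s, label(g, s))
```

A labelling function like this is how positive and negative training strings for such a benchmark could be generated.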
Continual Learning In Reinforcement Environments
1994
Cited by 74 (13 self)
Continual learning is the constant development of complex behaviors with no final end in mind. It is the process of learning ever more complicated skills by building on those skills already developed. In order for learning at one stage of development to serve as the foundation for later learning, a continual-learning agent should learn hierarchically. CHILD, an agent capable of Continual, Hierarchical, Incremental Learning and Development is proposed, described, tested, and evaluated in this dissertation. CHILD accumulates useful behaviors in reinforcement environments by using the Temporal Transition Hierarchies learning algorithm, also derived in the dissertation. This constructive algorithm generates a hierarchical, higher-order neural network that can be used for predicting context-dependent temporal sequences and can learn sequential-task benchmarks more than two orders of magnitude faster than competing neural-network systems. Consequently, CHILD can quickly solve complicated non...
Noisy Time Series Prediction using a Recurrent Neural Network and Grammatical Inference
Machine Learning, 2001
Cited by 47 (0 self)
Financial forecasting is an example of a signal processing problem which is challenging due to small sample sizes, high noise, nonstationarity, and nonlinearity. Neural networks have been very successful in a number of signal processing applications. We discuss fundamental limitations and inherent difficulties when using neural networks for the processing of high noise, small sample size signals. We introduce a new intelligent signal processing method which addresses the difficulties. The method proposed uses conversion into a symbolic representation with a self-organizing map, and grammatical inference with recurrent neural networks. We apply the method to the prediction of daily foreign exchange rates, addressing difficulties with nonstationarity, overfitting, and unequal a priori class probabilities, and we find significant predictability in comprehensive experiments covering 5 different foreign exchange rates. The method correctly predicts the direction of change for th...
Natural language grammatical inference with recurrent neural networks
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 1998
Cited by 45 (1 self)
This paper examines the inductive inference of a complex grammar with neural networks. Specifically, the task considered is that of training a network to classify natural language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the Principles and Parameters linguistic framework, or Government-and-Binding theory. Neural networks are trained, without the division into learned vs. innate components assumed by Chomsky, in an attempt to produce the same judgments as native speakers on sharply grammatical/ungrammatical data. How a recurrent neural network could possess linguistic capability and the properties of various common recurrent neural network architectures are discussed. The problem exhibits training behavior which is often not present with smaller grammars and training was initially difficult. However, after implementing several techniques aimed at improving the convergence of the gradient descent backpropagation-through-time training algorithm, significant learning was possible. It was found that certain architectures are better able to learn an appropriate grammar. The operation of the networks and their training is analyzed. Finally, the extraction of rules in the form of deterministic finite state automata is investigated.
Representation of Finite State Automata in Recurrent Radial Basis Function Networks
1996
Cited by 36 (5 self)
In this paper we propose some techniques for injecting finite state automata into Recurrent Radial Basis Function networks (R2BF). When providing proper hints and constraining the weight space properly, we show that these networks behave as automata. A technique is suggested for forcing the learning process to develop automata representations that is based on adding a proper penalty function to the ordinary cost. Successful experimental results are shown for inductive inference of regular grammars. Keywords: Automata, backpropagation through time, high-order neural networks, inductive inference, learning from hints, radial basis functions, recurrent radial basis functions, recurrent networks. 1. Introduction The ability of learning from examples is certainly the most appealing feature of neural networks. In the last few years, several researchers have used connectionist models for solving different kinds of problems ranging from robot control to pattern recognition. Coping with optimization of functions with several thousands of variables is quite common. Surprisingly, in many practical cases, global or near global optimization is attained also with non-sophisticated numerical methods. For example, successful applications of neural nets for recognition of handwritten characters (le Cun, 1989) and for phoneme discrimination (Waibel et al., 1989) have been proposed which do not report serious convergence problems. Some attempts to understand the theoretical reasons for the successes and failures of supervised learning schemes have been carried out which explain when such schemes are likely to succeed in discovering optimal solutions (Bianchini et al., 1994; Gori & Tesi, 1992; Yu, 1992), and to generalize to new examples (Baum & Haussler, 1989). These results give some ...
A Unified GradientDescent/Clustering Architecture for Finite State Machine Induction
NIPS, 1994
Cited by 32 (0 self)
Although recurrent neural nets have been moderately successful in learning to emulate finite-state machines (FSMs), the continuous internal state dynamics of a neural net are not well matched to the discrete behavior of an FSM. We describe an architecture, called DOLCE, that allows discrete states to evolve in a net as learning progresses. DOLCE consists of a standard recurrent neural net trained by gradient descent and an adaptive clustering technique that quantizes the state space. DOLCE is based on the assumption that a finite set of discrete internal states is required for the task, and that the actual network state belongs to this set but has been corrupted by noise due to inaccuracy in the weights. DOLCE learns to recover the discrete state with maximum a posteriori probability from the noisy state. Simulations show that DOLCE leads to a significant improvement in generalization performance over earlier neural net approaches to FSM induction.
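The idea of recovering a discrete state with maximum a posteriori probability from a noisy continuous state can be illustrated with a small sketch. It assumes isotropic Gaussian noise with a shared standard deviation and takes the cluster centers and priors as given; the abstract says the architecture adapts its clustering during learning, which is not shown here. The function name `map_state` and the parameter `sigma` are hypothetical.

```python
import math


def map_state(noisy, centers, priors, sigma=0.1):
    """Return the index of the discrete state with the highest posterior.

    Assumes the noisy state equals one of `centers` plus isotropic
    Gaussian noise with standard deviation `sigma` (an assumption of
    this sketch), so the log posterior is, up to a constant,
    log(prior) - squared_distance / (2 * sigma**2).
    """
    best_index, best_score = None, -math.inf
    for k, (center, prior) in enumerate(zip(centers, priors)):
        sq_dist = sum((x - m) ** 2 for x, m in zip(noisy, center))
        score = math.log(prior) - sq_dist / (2.0 * sigma ** 2)
        if score > best_score:
            best_index, best_score = k, score
    return best_index


if __name__ == "__main__":
    centers = [(0.0, 0.0), (1.0, 1.0)]
    # A state corrupted by small noise is snapped back to its cluster.
    print(map_state((0.1, -0.05), centers, priors=[0.5, 0.5]))
```

With equal priors this reduces to nearest-center quantization; unequal priors shift the decision boundary toward the less probable state.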
Analysis of Dynamical Recognizers
NEURAL COMPUTATION, 1996
Cited by 32 (5 self)
Pollack (1991) demonstrated that second-order recurrent neural networks can act as dynamical recognizers for formal languages when trained on positive and negative examples, and observed both phase transitions in learning and IFS-like fractal state sets. Follow-on work focused mainly on the extraction and minimization of a finite state automaton (FSA) from the trained network. However, such networks are capable of inducing languages which are not regular, and therefore not equivalent to any FSA. Indeed, it may be simpler for a small network to fit its training data by inducing such a nonregular language. But when is the network's language not regular? In this paper, using a low dimensional network capable of learning all the Tomita data sets, we present an empirical method for testing whether the language induced by the network is regular or not. We also provide a detailed machine analysis of trained networks for both regular and nonregular languages.
Evolving Deterministic Finite Automata Using Cellular Encoding
Stanford University, 1996
Cited by 29 (0 self)
This paper presents a method for the evolution of deterministic finite automata that combines genetic programming and cellular encoding. Programs are evolved that specify actions for the incremental growth of a deterministic finite automaton from an initial single-state zygote. The results show that, given a test bed of positive and negative samples, the proposed method is successful at inducing automata to recognize several different languages. 1. Introduction The automatic creation of finite automata has long been a goal of the evolutionary computation community. Fogel et al. [1966] were the first to propose the generation of deterministic finite automata (DFAs) by means of an evolutionary process, and the possibility of inferring languages from examples was initially established by Gold [1967]. Since then, much work has been done in the induction of DFAs for language recognition. Tomita [1982] showed that hill-climbing in the space of nine-state automata was both successful and supe...