Results 1–10 of 150
Learning Stochastic Regular Grammars by Means of a State Merging Method
1994
"... We propose a new Mgorithm which allows for the identification of any stochastic deterministic regular language as well as the determination of the probabilities of the strings in the language. The algorithm builds the prefix tree acceptor from the sample set and merges systematically equivaJent stat ..."
Abstract

Cited by 137 (13 self)
 Add to MetaCart
We propose a new algorithm which allows for the identification of any stochastic deterministic regular language as well as the determination of the probabilities of the strings in the language. The algorithm builds the prefix tree acceptor from the sample set and systematically merges equivalent states. Experimentally, it proves very fast, and the time needed grows only linearly with the size of the sample set.
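For readers who want the mechanics, here is a minimal Python sketch of the two-phase procedure the abstract outlines: build a frequency prefix tree acceptor from the sample, then test pairs of states for statistical compatibility before merging them. The Hoeffding-style test and the confidence parameter `alpha` are illustrative assumptions; the paper's exact compatibility criterion may differ.

```python
import math

class Node:
    def __init__(self):
        self.count = 0      # sample strings whose prefix reaches this state
        self.ends = 0       # sample strings terminating exactly here
        self.children = {}  # symbol -> Node

def build_prefix_tree(sample):
    """Frequency prefix tree acceptor: one state per distinct prefix."""
    root = Node()
    for s in sample:
        node = root
        node.count += 1
        for sym in s:
            node = node.children.setdefault(sym, Node())
            node.count += 1
        node.ends += 1
    return root

def compatible(a, b, alpha=0.05):
    """Hoeffding-style test: do two states terminate and branch with
    statistically indistinguishable frequencies?"""
    bound = (math.sqrt(1.0 / a.count) + math.sqrt(1.0 / b.count)) * \
            math.sqrt(0.5 * math.log(2.0 / alpha))
    if abs(a.ends / a.count - b.ends / b.count) > bound:
        return False
    for sym in set(a.children) | set(b.children):
        fa = a.children[sym].count / a.count if sym in a.children else 0.0
        fb = b.children[sym].count / b.count if sym in b.children else 0.0
        if abs(fa - fb) > bound:
            return False
    return True

tree = build_prefix_tree(["ab", "ab", "aab", "b"])
```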
Toward a connectionist model of recursion in human linguistic performance
Cognitive Science, 1999
"... Naturally occurring speech contains only a limited amount of complex recursive structure, and this is reflected in the empirically documented difficulties that people experience when processing such structures. We present a connectionist model of human performance in processing recursive language st ..."
Abstract

Cited by 126 (19 self)
 Add to MetaCart
Naturally occurring speech contains only a limited amount of complex recursive structure, and this is reflected in the empirically documented difficulties that people experience when processing such structures. We present a connectionist model of human performance in processing recursive language structures. The model is trained on simple artificial languages. We find that the qualitative performance profile of the model matches human behavior, both on the relative difficulty of center-embedding and cross-dependency, and on the difference between processing these complex recursive structures and right-branching recursive constructions. We analyze how these differences in performance are reflected in the internal representations of the model by performing discriminant analyses on these representations both before and after training. Furthermore, we show how a network trained to process recursive structures can also generate such structures in a probabilistic fashion. This work suggests a novel explanation of people's limited recursive performance, without assuming the existence of a mentally represented competence grammar allowing unbounded recursion.
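As a rough illustration of the model family involved, the following toy Elman-style simple recurrent network predicts the next symbol of a tiny a^n b^n language. The dimensions, the three-string corpus, and the truncated output-only weight update are assumptions made for brevity, not the paper's training setup.

```python
import numpy as np

rng = np.random.default_rng(0)
V, H = 3, 16                     # symbols {a, b, end}, hidden units
Wxh = rng.normal(0, 0.1, (H, V))
Whh = rng.normal(0, 0.1, (H, H))
Why = rng.normal(0, 0.1, (V, H))

def one_hot(i):
    v = np.zeros(V); v[i] = 1.0; return v

def corpus():
    # strings a^n b^n followed by an end marker (symbol 2)
    for n in (1, 2, 3):
        yield [0] * n + [1] * n + [2]

lr = 0.1
for epoch in range(500):
    for seq in corpus():
        h = np.zeros(H)
        for t in range(len(seq) - 1):
            h = np.tanh(Wxh @ one_hot(seq[t]) + Whh @ h)   # context layer
            p = np.exp(Why @ h); p /= p.sum()              # softmax prediction
            # cross-entropy gradient on the output weights only;
            # full backpropagation through time is omitted for brevity
            Why -= lr * np.outer(p - one_hot(seq[t + 1]), h)
```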
A General Framework for Adaptive Processing of Data Structures
IEEE Transactions on Neural Networks, 1998
"... A structured organization of information is typically required by symbolic processing. On the other hand, most connectionist models assume that data are organized according to relatively poor structures, like arrays or sequences. The framework described in this paper is an attempt to unify adaptive ..."
Abstract

Cited by 117 (46 self)
 Add to MetaCart
A structured organization of information is typically required by symbolic processing. On the other hand, most connectionist models assume that data are organized according to relatively poor structures, like arrays or sequences. The framework described in this paper is an attempt to unify adaptive models like artificial neural nets and belief nets for the problem of processing structured information. In particular, relations between data variables are expressed by directed acyclic graphs, where both numerical and categorical values coexist. The general framework proposed in this paper can be regarded as an extension of both recurrent neural networks and hidden Markov models to the case of acyclic graphs. In particular, we study the supervised learning problem as the problem of learning transductions from an input structured space to an output structured space, where transductions are assumed to admit a recursive hidden state-space representation. We introduce a graphical formalism for r...
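A minimal sketch of the recursive state computation this extension implies: the sequence update h_t = f(x_t, h_{t-1}) is generalized so that each node of a directed acyclic graph receives a state computed from its own label and the states of its children. The sizes and the fixed fan-in bound `K` below are illustrative assumptions.

```python
import numpy as np

K, D, H = 2, 4, 8                      # max fan-in, label size, state size
rng = np.random.default_rng(0)
Wx = rng.normal(0, 0.1, (H, D))
Wc = [rng.normal(0, 0.1, (H, H)) for _ in range(K)]

def node_state(label, child_states):
    """h(v) = tanh(Wx x(v) + sum_k Wc[k] h(child_k))."""
    h = Wx @ label
    for k, c in enumerate(child_states):
        h = h + Wc[k] @ c
    return np.tanh(h)

def encode(dag, labels, topo_order):
    """dag: node -> ordered child list; topo_order lists children
    before their parents, so states exist when a parent needs them."""
    states = {}
    for v in topo_order:
        states[v] = node_state(labels[v], [states[c] for c in dag[v]])
    return states
```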
Exploiting the Past and the Future in Protein Secondary Structure Prediction
1999
"... Motivation: Predicting the secondary structure of a protein (alphahelix, betasheet, coil) is an important step towards elucidating its three dimensional structure, as well as its function. Presently, the best predictors are based on machine learning approaches, in particular neural network archite ..."
Abstract

Cited by 116 (22 self)
 Add to MetaCart
Motivation: Predicting the secondary structure of a protein (alpha-helix, beta-sheet, coil) is an important step towards elucidating its three-dimensional structure, as well as its function. Presently, the best predictors are based on machine learning approaches, in particular neural network architectures with a fixed, and relatively short, input window of amino acids, centered at the prediction site. Although a fixed small window avoids overfitting problems, it does not permit capturing variable long-range information. Results: We introduce a family of novel architectures which can learn to make predictions based on variable ranges of dependencies. These architectures extend recurrent neural networks, introducing non-causal bidirectional dynamics to capture both upstream and downstream information. The prediction algorithm is completed by the use of mixtures of estimators that leverage evolutionary information, expressed in terms of multiple alignments, both at the input and output levels. While our system currently achieves an overall performance close to 76% correct prediction, at least comparable to the best existing systems, the main emphasis here is on the development of new algorithmic ideas. Availability: The executable program for predicting protein secondary structure is available from the authors free of charge. Contact: pfbaldi@ics.uci.edu, gpollast@ics.uci.edu, brunak@cbs.dtu.dk, paolo@dsi.unifi.it.
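A sketch of the non-causal bidirectional recurrence the abstract refers to: a forward chain accumulates upstream context, a backward chain accumulates downstream context, and each position's prediction combines both with the local input. Dimensions are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

D, H, C = 20, 12, 3                    # input profile, state size, classes
rng = np.random.default_rng(0)
Wf, Uf = rng.normal(0, 0.1, (H, H)), rng.normal(0, 0.1, (H, D))
Wb, Ub = rng.normal(0, 0.1, (H, H)), rng.normal(0, 0.1, (H, D))
Wo = rng.normal(0, 0.1, (C, 2 * H + D))

def predict(X):
    """X: (T, D) sequence, e.g. per-residue alignment profiles."""
    T = X.shape[0]
    F, B = np.zeros((T + 1, H)), np.zeros((T + 1, H))
    for t in range(T):                  # forward pass: upstream context
        F[t + 1] = np.tanh(Wf @ F[t] + Uf @ X[t])
    for t in reversed(range(T)):        # backward pass: downstream context
        B[t] = np.tanh(Wb @ B[t + 1] + Ub @ X[t])
    probs = []
    for t in range(T):
        z = Wo @ np.concatenate([F[t + 1], B[t], X[t]])
        e = np.exp(z - z.max()); probs.append(e / e.sum())
    return np.array(probs)              # (T, C) class probabilities
```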
An Input Output HMM Architecture
Advances in Neural Information Processing Systems, 1995
"... We introduce a recurrent architecture having a modular structure and we formulate a training procedure based on the EM algorithm. The resulting model has similarities to hidden Markov models, but supports recurrent networks processing style and allows to exploit the supervised learning paradigm ..."
Abstract

Cited by 108 (15 self)
 Add to MetaCart
We introduce a recurrent architecture having a modular structure and formulate a training procedure based on the EM algorithm. The resulting model has similarities to hidden Markov models, but supports the processing style of recurrent networks and allows the supervised learning paradigm to be exploited while using maximum likelihood estimation.
Input/Output HMMs for Sequence Processing
IEEE Transactions on Neural Networks, 1996
"... We consider problems of sequence processing and propose a solution based on a discrete state model in order to represent past context. Weintroduce a recurrent connectionist architecture having a modular structure that associates a subnetwork to each state. The model has a statistical interpretation ..."
Abstract

Cited by 98 (12 self)
 Add to MetaCart
We consider problems of sequence processing and propose a solution based on a discrete-state model in order to represent past context. We introduce a recurrent connectionist architecture having a modular structure that associates a subnetwork with each state. The model has a statistical interpretation that we call the Input/Output Hidden Markov Model (IOHMM). It can be trained by the EM or GEM algorithms, considering state trajectories as missing data, which decouples temporal credit assignment from actual parameter estimation. The model presents similarities to hidden Markov models (HMMs), but allows us to map input sequences to output sequences, using the same processing style as recurrent neural networks. IOHMMs are trained using a more discriminant learning paradigm than HMMs, while potentially taking advantage of the EM algorithm. We demonstrate that IOHMMs are well suited for solving grammatical inference problems on a benchmark problem. Experimental results are presented for the seven Tomita grammars, showing that these adaptive models can attain excellent generalization.
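A minimal numeric sketch of the computation both IOHMM abstracts describe: unlike a plain HMM, the state-transition distribution and the emission distribution are both conditioned on the current input. Here each state's "subnetwork" is reduced to a softmax-linear map, and the shapes are assumptions; the paper's subnetworks and EM updates are richer.

```python
import numpy as np

S, D, O = 3, 4, 2                      # states, input dim, output symbols
rng = np.random.default_rng(0)
A = rng.normal(0, 0.1, (S, S, D))      # per-input transition scores
B = rng.normal(0, 0.1, (S, O, D))      # per-input emission scores

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def log_likelihood(xs, ys):
    """log P(y_1..T | x_1..T) via the forward recursion with
    input-conditional transitions and emissions."""
    alpha = np.full(S, 1.0 / S)
    ll = 0.0
    for x, y in zip(xs, ys):
        trans = softmax(A @ x, axis=1)         # (S, S): P(s' | s, x)
        emit = softmax(B @ x, axis=1)[:, y]    # (S,):   P(y | s', x)
        alpha = (alpha @ trans) * emit
        ll += np.log(alpha.sum())
        alpha /= alpha.sum()                   # rescale to avoid underflow
    return ll
```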
Analog Computation via Neural Networks
Theoretical Computer Science, 1994
"... We pursue a particular approach to analog computation, based on dynamical systems of the type used in neural networks research. Our systems have a fixed structure, invariant in time, corresponding to an unchanging number of "neurons". If allowed exponential time for computation, they turn out to ha ..."
Abstract

Cited by 87 (8 self)
 Add to MetaCart
We pursue a particular approach to analog computation, based on dynamical systems of the type used in neural networks research. Our systems have a fixed structure, invariant in time, corresponding to an unchanging number of "neurons". If allowed exponential time for computation, they turn out to have unbounded power. However, under polynomial-time constraints there are limits on their capabilities, though they remain more powerful than Turing machines. (A similar but more restricted model was shown to be polynomial-time equivalent to classical digital computation in previous work [20].) Moreover, there is a precise correspondence between nets and standard non-uniform circuits with equivalent resources, and as a consequence one has lower-bound constraints on what they can compute. This relationship is perhaps surprising, since our analog devices do not change in any manner with input size. We note that these networks are not likely to solve NP-hard problems in polynomial time, as the equality ...
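The model class in question can be stated in a few lines: a fixed, finite set of "neurons" updated by an affine map of the current state and scalar input, passed through the saturated-linear activation; computational power then hinges on the weight type (rational vs. real) and the time allowed. The sizes in this sketch are illustrative.

```python
import numpy as np

def sat(z):
    """Saturated-linear activation: identity on [0, 1], clipped outside."""
    return np.clip(z, 0.0, 1.0)

def run(A, b, c, inputs, x0):
    """x(t+1) = sat(A x(t) + b u(t) + c); the structure (A, b, c) is
    fixed and does not grow with input size."""
    x = x0.copy()
    for u in inputs:
        x = sat(A @ x + b * u + c)
    return x

rng = np.random.default_rng(0)
n = 5
A, b, c = rng.normal(0, 0.5, (n, n)), rng.normal(0, 0.5, n), rng.normal(0, 0.5, n)
final = run(A, b, c, inputs=[0.0, 1.0, 1.0], x0=np.zeros(n))
```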
Constructing Deterministic Finite-State Automata in Recurrent Neural Networks
Journal of the ACM, 1996
"... Recurrent neural networks that are trained to behave like deterministic finitestate automata (DFAs) can show deteriorating performance when tested on long strings. This deteriorating performance can be attributed to the instability of the internal representation of the learned DFA states. The use o ..."
Abstract

Cited by 70 (16 self)
 Add to MetaCart
Recurrent neural networks that are trained to behave like deterministic finite-state automata (DFAs) can show deteriorating performance when tested on long strings. This deteriorating performance can be attributed to the instability of the internal representation of the learned DFA states. The use of a sigmoidal discriminant function together with the recurrent structure contributes to this instability. We prove that a simple algorithm can construct second-order recurrent neural networks with a sparse interconnection topology and sigmoidal discriminant function such that the internal DFA state representations are stable, i.e., the constructed network correctly classifies strings of arbitrary length. The algorithm is based on encoding strengths of weights directly into the neural network. We derive a relationship between the weight strength and the number of DFA states for robust string classification. For a DFA with n states and m input alphabet symbols, the constructive algorithm genera...
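A simplified version of the construction idea: in a second-order network, the product terms S_j(t) * I_k(t) select one DFA transition, and a sufficiently large weight strength keeps the sigmoid outputs near 0 and 1 so the state encoding stays stable over long strings. The specific weight/bias scheme and the strength value `Hstr` below are illustrative, not the paper's derived bound.

```python
import numpy as np

def build_net(delta, n_states, n_symbols, Hstr=8.0):
    """delta: dict (state, symbol) -> next state."""
    W = np.zeros((n_states, n_states, n_symbols))
    for (j, k), i in delta.items():
        W[i, j, k] = Hstr              # drive the target state neuron high
    bias = -Hstr / 2.0                 # keep all other neurons low
    return W, bias

def step(W, bias, S, k):
    """S_i(t+1) = sigmoid(sum_j W[i, j, k] * S_j(t) + bias)."""
    return 1.0 / (1.0 + np.exp(-(W[:, :, k] @ S + bias)))

# usage: a two-state DFA over {0, 1} tracking the parity of 1s
delta = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
W, b = build_net(delta, n_states=2, n_symbols=2)
S = np.array([1.0, 0.0])               # one-hot start state 0
for sym in [1, 0, 1]:
    S = step(W, b, S, sym)
print(S.argmax())                      # 0: an even number of 1s was seen
```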
Extracting Comprehensible Models from Trained Neural Networks
1996
"... To Mom, Dad, and Susan, for their support and encouragement. ..."
Abstract

Cited by 69 (4 self)
 Add to MetaCart
To Mom, Dad, and Susan, for their support and encouragement.
Extraction of Rules from Discrete-Time Recurrent Neural Networks
1996
"... The extraction of symbolic knowledge from trained neural networks and the direct encoding of (partial) knowledge into networks prior to training are important issues. They allow the exchange of information between symbolic and connectionist knowledge representations. The focas of this paper is on t ..."
Abstract

Cited by 61 (15 self)
 Add to MetaCart
The extraction of symbolic knowledge from trained neural networks and the direct encoding of (partial) knowledge into networks prior to training are important issues. They allow the exchange of information between symbolic and connectionist knowledge representations. The focus of this paper is on the quality of the rules that are extracted from recurrent neural networks. Discrete-time recurrent neural networks can be trained to correctly classify strings of a regular language. Rules defining the learned grammar can be extracted from networks in the form of deterministic finite-state automata (DFAs) by applying clustering algorithms in the output space of recurrent state neurons. Our algorithm can extract different finite-state automata that are consistent with a training set from the same network. We compare the generalization performances of these different models and the trained network, and we introduce a heuristic that permits us to choose, among the consistent DFAs, the model that best approximates the learned regular grammar.
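A minimal sketch of the extraction step described above: run the trained network over the sample strings, quantize the recurrent state space, and read off a DFA whose states are the quantization cells and whose transitions are the observed cell-to-cell moves. The `rnn_step` interface and the per-dimension grid quantizer are assumptions; the paper clusters the state space more carefully and compares the resulting consistent DFAs.

```python
from collections import defaultdict

def extract_dfa(rnn_step, h0, strings, q=2):
    """rnn_step(h, sym) -> next hidden state (tuple of floats in [0, 1]);
    q is the number of quantization bins per state dimension."""
    quant = lambda h: tuple(min(int(x * q), q - 1) for x in h)
    trans, seen = {}, defaultdict(set)
    for s in strings:
        h = h0
        for sym in s:
            a = quant(h)
            h = rnn_step(h, sym)
            b = quant(h)
            seen[(a, sym)].add(b)
            trans[(a, sym)] = b        # last observed transition wins
    # a cleanly extracted DFA is deterministic; report any conflicts
    conflicts = {k: v for k, v in seen.items() if len(v) > 1}
    return trans, conflicts
```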