Results 1-10 of 40
On The Computational Power Of Neural Nets
Journal of Computer and System Sciences, 1995
Cited by 174 (23 self)
This paper deals with finite-size networks which consist of interconnections of synchronously evolving processors. Each processor updates its state by applying a "sigmoidal" function to a linear combination of the previous states of all units. We prove that one may simulate all Turing machines by such nets. In particular, one can simulate any multi-stack Turing machine in real time, and there is a net made up of 886 processors which computes a universal partial-recursive function. Products (high-order nets) are not required, contrary to what had been stated in the literature. Nondeterministic Turing machines can be simulated by nondeterministic rational nets, also in real time. The simulation result has many consequences regarding the decidability, or more generally the complexity, of questions about recursive nets.
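The update rule described in this abstract (each unit applies a "sigmoidal" function to a linear combination of the previous states of all units) can be sketched as follows. This is only an illustration: the saturated-linear activation, the names `W`, `V`, `bias`, and the tiny two-unit delay net are assumptions chosen for the example, not the paper's construction.

```python
import numpy as np

def saturated_linear(x):
    """Piecewise-linear 'sigmoid': 0 below 0, identity on [0, 1], 1 above 1."""
    return np.clip(x, 0.0, 1.0)

def step(state, inp, W, V, bias):
    """One synchronous update: every unit reads the previous state of all units."""
    return saturated_linear(W @ state + V @ inp + bias)

# Hypothetical 2-unit net: unit 0 copies the input,
# unit 1 copies unit 0 with a delay of one step.
W = np.array([[0.0, 0.0],
              [1.0, 0.0]])
V = np.array([[1.0],
              [0.0]])
bias = np.zeros(2)

state = np.zeros(2)
trace = []
for u in [1.0, 0.0, 1.0, 1.0]:
    state = step(state, np.array([u]), W, V, bias)
    trace.append(state.copy())
```

Because all units are updated from the same previous state vector, unit 1 always lags unit 0 by exactly one time step.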
Turing Computability With Neural Nets
Applied Mathematics Letters, 1991
Cited by 82 (14 self)
This paper shows the existence of a finite neural network, made up of sigmoidal neurons, which simulates a universal Turing machine. It is composed of less than 10^5 synchronously evolving processors, interconnected linearly. High-order connections are not required.
1. Introduction
This paper addresses the question: What ultimate limitations, if any, are imposed by the use of neural nets as computing devices? In particular, and ignoring issues of training and practicality of implementation, one would like to know if every problem that can be solved by a digital computer is also solvable in principle using a net. This question has been asked before in the literature. Indeed, Jordan Pollack ([7]) showed that a certain recurrent net model, which he called a "neuring machine" (for "neural Turing"), is universal. In his model, all neurons synchronously update their states according to a quadratic combination of past activation values. In general, one calls high-order nets those in...
Constructing Deterministic FiniteState Automata in Recurrent Neural Networks
Journal of the ACM, 1996
Cited by 75 (16 self)
Recurrent neural networks that are trained to behave like deterministic finite-state automata (DFAs) can show deteriorating performance when tested on long strings. This deteriorating performance can be attributed to the instability of the internal representation of the learned DFA states. The use of a sigmoidal discriminant function together with the recurrent structure contributes to this instability. We prove that a simple algorithm can construct second-order recurrent neural networks with a sparse interconnection topology and sigmoidal discriminant function such that the internal DFA state representations are stable, i.e., the constructed network correctly classifies strings of arbitrary length. The algorithm is based on encoding strengths of weights directly into the neural network. We derive a relationship between the weight strength and the number of DFA states for robust string classification. For a DFA with n states and m input alphabet symbols, the constructive algorithm genera...
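The idea of programming a DFA into second-order weights of a fixed strength can be illustrated with a small sketch. It is a simplified illustration under stated assumptions, not the paper's exact algorithm: the weight rule (`+H` on the target state, `-H` on the state being left, bias `-H/2`), the value `H = 8`, and the names `encode_dfa` and `run` are all hypothetical choices made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode_dfa(delta, n_states, n_symbols, H):
    """Build second-order weights W[i, j, k] from transitions delta[(j, k)] = i."""
    W = np.zeros((n_states, n_states, n_symbols))
    for (j, k), i in delta.items():
        W[i, j, k] = H           # drive the target state neuron high
        if i != j:
            W[j, j, k] = -H      # drive the state being left low
    bias = -H / 2.0 * np.ones(n_states)
    return W, bias

def run(W, bias, string, start=0):
    """Feed a string of symbol indices; return the final state activations."""
    S = np.zeros(W.shape[0])
    S[start] = 1.0
    for k in string:
        # With one-hot inputs, sum_jk W[i,j,k] * S[j] * I[k] reduces to W[:, :, k] @ S.
        S = sigmoid(bias + W[:, :, k] @ S)
    return S

# Parity DFA (assumed example): state 0 = even number of 1s, state 1 = odd.
delta = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
W, bias = encode_dfa(delta, n_states=2, n_symbols=2, H=8.0)
final = run(W, bias, [1, 0, 1, 1])   # three 1s, so the odd state should be active
```

The neuron with the largest activation is read as the current DFA state; a larger `H` pushes activations closer to 0 and 1, which is the kind of weight-strength trade-off the abstract refers to.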
Approximating the Semantics of Logic Programs by Recurrent Neural Networks
Cited by 62 (10 self)
In [18] we have shown how to construct a 3-layered recurrent neural network that computes the fixed point of the meaning function T_P of a given propositional logic program P, which corresponds to the computation of the semantics of P. In this article we consider the first-order case. We define a notion of approximation for interpretations and prove that there exists a 3-layered feed-forward neural network that approximates the calculation of T_P for a given first-order acyclic logic program P with an injective level mapping arbitrarily well. Extending the feed-forward network by recurrent connections, we obtain a recurrent neural network whose iteration approximates the fixed point of T_P. This result is proven by taking advantage of the fact that for acyclic logic programs the function T_P is a contraction mapping on a complete metric space defined by the interpretations of the program. Mapping this space to the metric space ℝ with Euclidean distance, a real-valued function f_P can be defined which corresponds to T_P and is continuous as well as a contraction. Consequently, it can be approximated by an appropriately chosen class of feed-forward neural networks.
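For the propositional case mentioned at the start of this abstract, the fixed point of T_P can be computed by plain iteration, which is the computation the recurrent network is said to perform. The sketch below does this symbolically rather than with a network; the clause encoding `(head, positive_body, negative_body)` and the names `t_p` and `fixed_point` are assumptions made for illustration.

```python
def t_p(program, interpretation):
    """Immediate consequence operator: heads of clauses whose body holds."""
    return frozenset(
        head
        for head, pos, neg in program
        if set(pos) <= interpretation and not (set(neg) & interpretation)
    )

def fixed_point(program, max_iters=100):
    """Iterate T_P from the empty interpretation until it stops changing."""
    current = frozenset()
    for _ in range(max_iters):
        nxt = t_p(program, current)
        if nxt == current:
            return current
        current = nxt
    raise RuntimeError("no fixed point reached within max_iters")

# Hypothetical acyclic program:  a.   b :- a.   c :- b, not d.
program = [
    ("a", [], []),
    ("b", ["a"], []),
    ("c", ["b"], ["d"]),
]
model = fixed_point(program)
```

For an acyclic program such as this one the iteration settles after a few steps; the iteration cap is only a guard for programs where T_P has no fixed point reachable this way.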
On the Effect of Analog Noise in Discrete-Time Analog Computations
Neural Computation, 1997
Cited by 59 (15 self)
We introduce a model for noise-robust analog computations with discrete time that is flexible enough to cover the most important concrete cases, such as computations in noisy analog neural nets and networks of noisy spiking neurons. We show that the presence of arbitrarily small amounts of analog noise reduces the power of analog computational models to that of finite automata, and we also prove a new type of upper bound for the VC-dimension of computational models with analog noise.
1 Introduction
Analog noise is a serious issue in practical analog computation. However, there exists no formal model for reliable computations by noisy analog systems which allows us to address this issue in an adequate manner. The investigation of noise-tolerant digital computations in the presence of stochastic failures of gates or wires had been initiated by [von Neumann, 1956]. We refer to [Cowan, 1966] and [Pippenger, 1989] for a small sample of the numerous results that have been achieved in this d...
Computational Capabilities of Recurrent NARX Neural Networks
IEEE Trans. on Systems, Man and Cybernetics, 1997
Cited by 51 (9 self)
Recently, fully connected recurrent neural networks have been proven to be computationally rich, at least as powerful as Turing machines. This work focuses on another network which is popular in control applications and has been found to be very effective at learning a variety of problems. These networks are based upon Nonlinear AutoRegressive models with eXogenous Inputs (NARX models), and are therefore called NARX networks. As opposed to other recurrent networks, NARX networks have a limited feedback which comes only from the output neuron rather than from hidden states. They are formalized by $y(t) = \Psi\bigl(u(t - n_u), \ldots, u(t - 1), u(t), y(t - n_y), \ldots, y(t - 1)\bigr)$, where $u(t)$ and $y(t)$ represent input and output of the network at time $t$, $n_u$ and $n_y$ are the input and output order, and the function $\Psi$ is the mapping performed by a Multilayer Perceptron. We constructively prove that the NARX networks with a finite number of parameters are computationally as strong as fully connected recurrent networks and thus Turing machines. We conclude that in theory one can use the NARX models, rather than conventional recurrent networks, without any computational loss even though their feedback is limited. Furthermore, these results raise the issue of what amount of feedback or recurrence is necessary for any network to be Turing equivalent and what restrictions on feedback limit computational power.
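The NARX recurrence in this abstract, where feedback comes only from delayed outputs and the mapping is a multilayer perceptron, can be sketched as follows. The random untrained weights, the `tanh` activations, zero-padded histories, and the names `make_mlp` and `narx_run` are assumptions for illustration only.

```python
import numpy as np

def make_mlp(n_in, n_hidden, rng):
    """Return a small one-hidden-layer perceptron with random (untrained) weights."""
    W1 = rng.normal(size=(n_hidden, n_in))
    b1 = rng.normal(size=n_hidden)
    w2 = rng.normal(size=n_hidden)

    def psi(x):
        return float(np.tanh(w2 @ np.tanh(W1 @ x + b1)))

    return psi

def narx_run(psi, inputs, n_u, n_y):
    """Compute y(t) = psi(u(t-n_u..t-1), u(t), y(t-n_y..t-1)) over an input sequence."""
    u_hist = [0.0] * n_u   # delayed inputs, oldest first (zero-padded at the start)
    y_hist = [0.0] * n_y   # delayed outputs, oldest first (zero-padded at the start)
    outputs = []
    for u in inputs:
        x = np.array(u_hist + [u] + y_hist)
        y = psi(x)
        outputs.append(y)
        # Slide both delay lines forward by one step.
        u_hist = (u_hist + [u])[1:]
        y_hist = (y_hist + [y])[1:]
    return outputs

n_u, n_y = 2, 3
psi = make_mlp(n_in=n_u + 1 + n_y, n_hidden=4, rng=np.random.default_rng(0))
inputs = [0.5, -1.0, 0.25, 0.0, 1.0]
outputs = narx_run(psi, inputs, n_u, n_y)
```

Note that the only state carried between steps is the two delay lines; no hidden activations are fed back, which is the limited-feedback property the abstract emphasizes.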
Extracting and Learning an Unknown Grammar with Recurrent Neural Networks
1992
Cited by 44 (12 self)
Simple second-order recurrent networks are shown to readily learn small known regular grammars when trained with positive and negative string examples. We show that similar methods are appropriate for learning unknown grammars from examples of their strings. The training algorithm is an incremental real-time recurrent learning (RTRL) method that computes the complete gradient and updates the weights at the end of each string. After or during training, a dynamic clustering algorithm extracts the production rules that the neural network has learned. The methods are illustrated by extracting rules from unknown deterministic regular grammars. For many cases the extracted grammar outperforms the neural net from which it was extracted in correctly classifying unseen strings.
1 INTRODUCTION
For many reasons, there has been a long interest in "language" models of neural networks; see [Elman 1991] for an excellent discussion. The orientation of this work is somewhat different. The focus her...
Finite State Machines and Recurrent Neural Networks - Automata and Dynamical Systems Approaches
Neural Networks and Pattern Recognition, 1998
Cited by 29 (11 self)
We present two approaches to the analysis of the relationship between a recurrent neural network (RNN) and the finite state machine M the network is able to exactly mimic. First, the network is treated as a state machine and the relationship between the RNN and M is established in the context of the algebraic theory of automata. In the second approach, the RNN is viewed as a set of discrete-time dynamical systems associated with input symbols of M. In particular, issues concerning network representation of loops and cycles in the state transition diagram of M are shown to provide a basis for the interpretation of the learning process from the point of view of bifurcation analysis. The circumstances under which a loop corresponding to an input symbol x is represented by an attractive fixed point of the underlying dynamical system associated with x are investigated. For the case of two recurrent neurons, under some assumptions on weight values, bifurcations can be understood in the geometrical c...
Constructive Learning of Recurrent Neural Networks: Limitations of Recurrent Cascade Correlation and a Simple Solution
1993
Cited by 29 (9 self)
It is often difficult to predict the optimal neural network size for a particular application. Constructive or destructive methods that add or subtract neurons, layers, connections, etc. might offer a solution to this problem. We prove that one method, Recurrent Cascade Correlation, due to its topology, has fundamental limitations in representation and thus in its learning capabilities. With monotone (i.e., sigmoid) and hard-threshold activation functions, it cannot represent certain finite-state automata. We give a "preliminary" approach on how to get around these limitations by devising a simple constructive training method that adds neurons during training while still preserving the powerful fully recurrent structure. We illustrate this approach by simulations which learn many examples of regular grammars that the Recurrent Cascade Correlation method is unable to learn.
1 Introduction
Choosing the architecture of a neural network for a particular problem usually requires some prior k...
First-Order vs. Second-Order Single Layer Recurrent Neural Networks
IEEE Transactions on Neural Networks, 1994
Cited by 27 (4 self)
We examine the representational capabilities of first-order and second-order Single Layer Recurrent Neural Networks (SLRNNs) with hard-limiting neurons. We show that a second-order SLRNN is strictly more powerful than a first-order SLRNN. However, if the first-order SLRNN is augmented with output layers of feedforward neurons, it can implement any finite-state recognizer, but only if state-splitting is employed. When a state is split, it is divided into two equivalent states. The judicious use of state-splitting allows for efficient implementation of finite-state recognizers using augmented first-order SLRNNs.