## An Overview Of The Computational Power Of Recurrent Neural Networks (2000)

Venue: | Proceedings of the 9th Finnish AI Conference STeP 2000 - Millennium of AI, Espoo, Finland (Vol. 3: "AI of Tomorrow": Symposium on Theory), Finnish AI Society |

Citations: | 10 - 2 self |

### BibTeX

@INPROCEEDINGS{Orponen00anoverview,

author = {Pekka Orponen},

title = {An Overview Of The Computational Power Of Recurrent Neural Networks},

booktitle = {Proceedings of the 9th Finnish AI Conference STeP 2000 -- Millennium of AI, Espoo, Finland (Vol. 3: "AI of Tomorrow": Symposium on Theory), Finnish AI Society},

year = {2000},

pages = {89--96},

publisher = {Finnish AI Society}

}

### Abstract

INTRODUCTION. The two main streams of neural networks research consider neural networks either as a powerful family of nonlinear statistical models, to be used in, for example, pattern recognition applications [6], or as formal models to help develop a computational understanding of the brain [10]. Historically, the brain theory interest was primary [32], but with the advances in computer technology, the application potential of the statistical modeling techniques has shifted the balance. The study of neural networks as general computational devices does not strictly follow this division of interests: rather, it provides a general framework outlining the limitations and possibilities affecting both research domains. The prime historic example here is obviously Minsky's and Papert's 1969 study of the computational limitations of single-layer perceptrons [34], which was a major influence in turning away interest from neural network learning to symbolic AI techniques for more than a decade.

### Citations

5575 |
Neural Networks for Pattern Recognition
- Bishop
- 1995
Citation Context ...CTION The two main streams of neural networks research consider neural networks either as a powerful family of nonlinear statistical models, to be used in for example pattern recognition applications [6], or as formal models to help develop a computational understanding of the brain [10]. Historically, the brain theory interest was primary [32], but with the advances in computer technology, the appli... |

890 |
A logical calculus of the ideas immanent in nervous activity
- McCulloch, Pitts
- 1943
Citation Context ...o be used in for example pattern recognition applications [6], or as formal models to help develop a computational understanding of the brain [10]. Historically, the brain theory interest was primary [32], but with the advances in computer technology, the application potential of the statistical modeling techniques has shifted the balance. The study of neural networks as general computational device... |

576 |
Perceptrons: an introduction to computational geometry
- Minsky, Papert
- 1968
Citation Context ...limitations and possibilities affecting both research domains. The prime historic example here is obviously Minsky's and Papert's 1969 study of the computational limitations of single-layer perceptrons [34], which was a major influence in turning away interest from neural network learning to symbolic AI techniques for more than a decade. A less dramatic, but at least as significant example is Kleene's 195... |

381 |
The Computational Brain
- Churchland, Sejnowski
- 1992
Citation Context ...er as a powerful family of nonlinear statistical models, to be used in for example pattern recognition applications [6], or as formal models to help develop a computational understanding of the brain [10]. Historically, the brain theory interest was primary [32], but with the advances in computer technology, the application potential of the statistical modeling techniques has shifted the balance. Th... |

370 | The Complexity of Boolean Functions
- Wegener
- 1987
Citation Context ...the computational power of recurrent (cyclic) neural networks. The computational study of feedforward (acyclic) networks has intimate connections to the classical theory of Boolean circuit complexity [53], and is surveyed briefly in the articles [36, 40], and at greater depth in the books [41, 43, 52]. 2. BASIC NOTIONS AND RESULTS With the brief exception of Section 6 on continuous-time models, we shal... |

235 |
Pulsed Neural Networks
- Maass, Bishop
- 2001
Citation Context ...ing recent instance of an ambitious research programme aspiring to connect computation theory to real biology is Wolfgang Maass's work on the computational capabilities of pulse coded neural networks [29, 30]. In this overview, we focus on the computational power of recurrent (cyclic) neural networks. The computational study of feedforward (acyclic) networks has intimate connections to the classical theor... |

160 |
Neural Networks and Analog Computation: Beyond the Turing Limit
- Siegelmann
- 1999
Citation Context ...tional power of analog Hopfield nets with an external clock is the same as that of asymmetric analog networks. The computational power of discrete-time analog networks is further discussed in the book [45]. However, the theoretical fascination of this topic is tempered by the practical observation that simulating arbitrarily long computations on a single finite network requires arbitrary-precision real n... |

123 |
Neurocomputing: Foundations of Research
- Anderson, Rosenfeld
- 1990 |

108 |
The Emotion Machine
- Minsky
- 2006
Citation Context ...ivalent to finite automata for processing sequentially given inputs. A somewhat interesting question here is how efficient are neural nets as representations of finite automata. The elementary constructions [33] yield a network of about 2m binary-state neurons for simulating an m-state automaton, and it was proved only relatively recently in [1] that at least $\Omega((m \log m)^{1/3})$ neurons are really required in t... |

103 |
The dynamics of discrete-time computation, with application to recurrent neural networks and finite state machine extraction
- Casey
- 1996
Citation Context ...ulating arbitrarily long computations on a single finite network requires arbitrary-precision real number calculations, and these are in practice unavoidably corrupted by noise. In fact, it is shown in [9, 31] that any small amount of noise reduces the computational power of analog recurrent networks back to that of finite automata. 6. CONTINUOUS-TIME NETWORKS In a continuous-time network, discussed for inst... |

91 | Analog computation via neural networks
- Siegelmann, Sontag
- 1994
Citation Context ...l updates, symmetric binary networks have the same computational capabilities as the other models considered. 5. DISCRETE-TIME ANALOG NETWORKS An interesting result proved by Siegelmann and Sontag in [46, 47] shows that if one moves from binary-state to analog-state neurons, then arbitrary Turing machines may be simulated by single, finite recurrent networks. The original construction in [46] required 1058 ... |

83 |
Simple local search problems that are hard to solve
- Schäffer, Yannakakis
- 1991
Citation Context ...the construction is reviewed in [37]. The result can also be shown to follow, although in a somewhat convoluted way, from the more general theory of "PLS-completeness" for local optimization problems [44]. one for each input size. If arbitrary changes to the network structure are possible, then this model is fundamentally different from finite automata. Since the computations of a binary recurrent net of... |

57 |
Circuit complexity and neural networks
- Parberry
- 1994
Citation Context ... feedforward (acyclic) networks has intimate connections to the classical theory of Boolean circuit complexity [53], and is surveyed briefly in the articles [36, 40], and at greater depth in the books [41, 43, 52]. 2. BASIC NOTIONS AND RESULTS With the brief exception of Section 6 on continuous-time models, we shall be concerned with finite discrete-time recurrent networks. Such a network consists of n computati... |

55 |
Computability with low-dimensional dynamical systems
- Koiran, Cosnard, et al.
- 1994
Citation Context ...gle, finite recurrent networks. The original construction in [46] required 1058 saturated-linear neurons to simulate a universal Turing machine, but this has later been improved to at least 114 neurons [27], and even to 25 neurons [23]. The starting point in these constructions, and also in many other recent simulations of Turing machines by finite-dimensional dynamical systems (e.g. [3, 7, 27, 35]), is t... |

55 |
Computational power of neural networks
- Siegelmann, Sontag
- 1995
Citation Context ...l updates, symmetric binary networks have the same computational capabilities as the other models considered. 5. DISCRETE-TIME ANALOG NETWORKS An interesting result proved by Siegelmann and Sontag in [46, 47] shows that if one moves from binary-state to analog-state neurons, then arbitrary Turing machines may be simulated by single, finite recurrent networks. The original construction in [46] required 1058 ... |

53 |
Turing machines that take advice, L’Enseignement Mathématique IIe Série, Tome XXVIII
- Karp, Lipton
- 1982
Citation Context ...des with the class P/poly of functions computable by "nonuniform" Turing machines in polynomial time. This is basically the standard complexity class P, with a certain technical proviso introduced in [24] to account for changing the machine structure for different input sizes. More interesting results can be obtained concerning unbounded time computations by binary recurrent networks. A folklore result... |

52 | On the effect of analog noise in discrete-time analog computations
- Maass, Orponen
- 1998
Citation Context ...ulating arbitrarily long computations on a single finite network requires arbitrary-precision real number calculations, and these are in practice unavoidably corrupted by noise. In fact, it is shown in [9, 31] that any small amount of noise reduces the computational power of analog recurrent networks back to that of finite automata. 6. CONTINUOUS-TIME NETWORKS In a continuous-time network, discussed for inst... |

50 |
Neural networks and physical systems with emergent collective computational abilities
- Hopfield
- 1982
Citation Context ...ymmetric sequential nets to guarantee the Liapunov property.) Without loss of generality [41], one may also assume non-zero excitations $\xi_j^{(t)} \neq 0$, $j = 1, \ldots, n$. It was then observed by Hopfield [20] (see also [11, 16]) that the following energy function $E(\mathbf{y}^{(t)}) = E(t) = -\frac{1}{2} \sum_{j=1}^{n} \sum_{i=1}^{n} w_{ji} y_i^{(t)} y_j^{(t)}$ has the property that $E(t) \le E(t-1) - 1$ for every update step $t \ge 1$ of a productiv... |

45 | Paradigms for computing with spiking neurons
- Maass
- 1999
Citation Context ...ing recent instance of an ambitious research programme aspiring to connect computation theory to real biology is Wolfgang Maass's work on the computational capabilities of pulse coded neural networks [29, 30]. In this overview, we focus on the computational power of recurrent (cyclic) neural networks. The computational study of feedforward (acyclic) networks has intimate connections to the classical theor... |

36 |
Bounds on the complexity of recurrent neural network implementations of finite state machines
- Horne, Hush
- 1996
Citation Context ...utomaton, and it was proved only relatively recently in [1] that at least $\Omega((m \log m)^{1/3})$ neurons are really required in the worst case. The upper bound was further improved to $O(m^{1/2})$ neurons in [22, 23], and it was shown that under some additional constraints this upper bound is tight. The case of sequence processing by symmetric binary-state networks was considered in [48, 51], where it was shown t... |

30 | On some relations between dynamical systems and transition systems
- Asarin, Maler
- 1994
Citation Context ...east 114 neurons [27], and even to 25 neurons [23]. The starting point in these constructions, and also in many other recent simulations of Turing machines by finite-dimensional dynamical systems (e.g. [3, 7, 27, 35]), is the well-known correspondence of Turing machines and two-stack pushdown automata. The Turing machine tape is first represented as two opposing stacks, and then the contents of these stacks are enc... |

28 |
A primer on the complexity theory of neural networks
- Parberry
- 1990
Citation Context ... neural networks. The computational study of feedforward (acyclic) networks has intimate connections to the classical theory of Boolean circuit complexity [53], and is surveyed briefly in the articles [36, 40], and at greater depth in the books [41, 43, 52]. 2. BASIC NOTIONS AND RESULTS With the brief exception of Section 6 on continuous-time models, we shall be concerned with finite discrete-time recurrent ... |

27 |
Neural and Automata Networks
- Goles, Martínez
- 1990
Citation Context ... also possible. For example, under sequential mode only one neuron updates its state at each time instant. Also various block-parallel dynamics can be considered, but we do not discuss them here (see [12, 15]). A fundamental property of symmetric networks is that their dynamics are constrained by Liapunov, or "energy" functions. A Liapunov function E is a bounded real-valued function defined on the state s... |

24 |
Optimal simulation of automata by neural nets
- Indyk
- 1995
Citation Context ...utomaton, and it was proved only relatively recently in [1] that at least $\Omega((m \log m)^{1/3})$ neurons are really required in the worst case. The upper bound was further improved to $O(m^{1/2})$ neurons in [22, 23], and it was shown that under some additional constraints this upper bound is tight. The case of sequence processing by symmetric binary-state networks was considered in [48, 51], where it was shown t... |

23 |
Steepest descent can take exponential time for symmetric connectionist networks
- Haken, Luby
- 1988
Citation Context ...hat networks with exponentially large weights may indeed require an exponential time to converge. This was first shown in [17] for synchronous updates (a simplified construction appears in [14]), and in [19] for a particular sequential update rule. (For a different sequential rule the result actually follows already from [17] or [14] by a fairly simple construction.) Finally, a network requiring exponenti... |

23 |
Representation of events in nerve nets and automata
- Kleene
- 1956
Citation Context ...was a major influence in turning away interest from neural network learning to symbolic AI techniques for more than a decade. A less dramatic, but at least as significant example is Kleene's 1956 paper [25] presenting an algebraic characterization of the computations feasible in finite McCulloch-Pitts neural networks, and thereby introducing the notion of regular expressions and their connection to finite ... |

23 | Computational complexity of neural networks: A survey
- Orponen
- 1994
Citation Context ... neural networks. The computational study of feedforward (acyclic) networks has intimate connections to the classical theory of Boolean circuit complexity [53], and is surveyed briefly in the articles [36, 40], and at greater depth in the books [41, 43, 52]. 2. BASIC NOTIONS AND RESULTS With the brief exception of Section 6 on continuous-time models, we shall be concerned with finite discrete-time recurrent ... |

22 |
Neurons with graded response have collective properties like those of two-state neurons
- Hopfield
- 1984
Citation Context ...unt of noise reduces the computational power of analog recurrent networks back to that of finite automata. 6. CONTINUOUS-TIME NETWORKS In a continuous-time network, discussed for instance by Hopfield in [21], the dynamics of the network state $\mathbf{y}^{(t)} = (y_1^{(t)}, \ldots, y_n^{(t)}) \in [0, 1]^n$ is determined for every real $t > 0$ by the following system of differential equations, with the initial network state ... |

21 |
Exponential transient classes of symmetric neural networks for synchronous and sequential updating
- Goles, Martínez
- 1989
Citation Context ...can be shown that networks with exponentially large weights may indeed require an exponential time to converge. This was first shown in [17] for synchronous updates (a simplified construction appears in [14]), and in [19] for a particular sequential update rule. (For a different sequential rule the result actually follows already from [17] or [14] by a fairly simple construction.) Finally, a network requi... |

21 |
On periodical behaviour in societies with symmetric influences
- Poljak, Sůra, et al.
- 1983
Citation Context ...cur, within $O(W)$ update steps. An analogous result can be shown for synchronous dynamics in binary nets; however in this case the network may also converge to a limit cycle of two alternating states [8, 13, 42]. It also follows from these considerations that if the weights $w_{ij}$ in a symmetric network are polynomially bounded in the number of units n, then the network converges in polynomial time. Conversely... |

19 | Computing with truly asynchronous threshold logic networks
- Orponen
- 1997
Citation Context ...res that the network grows with increasing input size, i.e., that we actually consider nonuniform sequences of networks, 2 The manuscript [18] remains unpublished, but the construction is reviewed in [37]. The result can also be shown to follow, although in a somewhat convoluted way, from the more general theory of "PLS-completeness" for local optimization problems [44]. one for each input size. If ar... |

18 |
Computational power for networks of threshold devices in asynchronous environment (Tech. Rep
- Lepley, Miller
- 1983
Citation Context ...ucture for different input sizes. More interesting results can be obtained concerning unbounded time computations by binary recurrent networks. A folklore result, apparently first formulated in print in [28], states that polynomially space-bounded Turing machines can be simulated by polynomial-size binary recurrent nets. The idea of the construction is as follows: one starts with the standard simulation o... |

17 |
Decreasing energy functions as a tool for studying threshold networks. Discrete Applied Mathematics 12
- Goles, Fogelman-Soulie, et al.
- 1985
Citation Context ...cur, within $O(W)$ update steps. An analogous result can be shown for synchronous dynamics in binary nets; however in this case the network may also converge to a limit cycle of two alternating states [8, 13, 42]. It also follows from these considerations that if the weights $w_{ij}$ in a symmetric network are polynomially bounded in the number of units n, then the network converges in polynomial time. Conversely... |

17 |
Connectionist networks that need exponential time to stabilize. Unpublished manuscript
- Haken
- 1989
Citation Context ... result actually follows already from [17] or [14] by a fairly simple construction.) Finally, a network requiring exponential time for convergence under any sequential update rule was demonstrated in [18]. 3. FINITE BINARY-STATE NETWORKS It has been known since the early work of McCulloch and Pitts [32] and Kleene [25] that finite binary-state neural networks are equivalent to finite automata for proces... |

15 |
Analog computation with continuous ODEs, in
- Branicky
- 1994
Citation Context ...east 114 neurons [27], and even to 25 neurons [23]. The starting point in these constructions, and also in many other recent simulations of Turing machines by finite-dimensional dynamical systems (e.g. [3, 7, 27, 35]), is the well-known correspondence of Turing machines and two-stack pushdown automata. The Turing machine tape is first represented as two opposing stacks, and then the contents of these stacks are enc... |

15 |
Transient length in sequential iterations of threshold functions
- Fogelman, Goles, et al.
- 1983
Citation Context ...tial nets to guarantee the Liapunov property.) Without loss of generality [41], one may also assume non-zero excitations $\xi_j^{(t)} \neq 0$, $j = 1, \ldots, n$. It was then observed by Hopfield [20] (see also [11, 16]) that the following energy function $E(\mathbf{y}^{(t)}) = E(t) = -\frac{1}{2} \sum_{j=1}^{n} \sum_{i=1}^{n} w_{ji} y_i^{(t)} y_j^{(t)}$ has the property that $E(t) \le E(t-1) - 1$ for every update step $t \ge 1$ of a productive computation. More... |

15 |
Dynamics of discrete time, continuous state Hopfield networks
- Koiran
- 1994
Citation Context ...computational power of symmetric analog networks. A Liapunov-function argument applies here too to show that under fully parallel updates such networks converge to a limit cycle of length at most two [26]. In this case the only possibility for a general Turing machine simulation on a single network would be to exploit finer and finer distinctions among a sequence of network states converging to a limit cy... |

14 | The computational power of continuous time neural networks
- Orponen
- 1997
Citation Context ...symmetric continuous-time networks [21]. Because of the simultaneous evolution of the neuron states, the dynamics of continuous-time networks are quite difficult to control. Nevertheless, it was shown in [39] that asymmetric continuous-time networks based on the saturated-linear activation function can simulate asymmetric binary-state networks with a linear size overhead, and in [49] that the same holds a... |

13 |
Comportement periodique des fonctions a seuil binaires et applications
- Goles, Olivos
- 1981
Citation Context ...tial nets to guarantee the Liapunov property.) Without loss of generality [41], one may also assume non-zero excitations $\xi_j^{(t)} \neq 0$, $j = 1, \ldots, n$. It was then observed by Hopfield [20] (see also [11, 16]) that the following energy function $E(\mathbf{y}^{(t)}) = E(t) = -\frac{1}{2} \sum_{j=1}^{n} \sum_{i=1}^{n} w_{ji} y_i^{(t)} y_j^{(t)}$ has the property that $E(t) \le E(t-1) - 1$ for every update step $t \ge 1$ of a productive computation. More... |

13 |
The convergence of symmetric threshold automata
- Goles, Olivos
- 1981
Citation Context ... n, then the network converges in polynomial time. Conversely, it can be shown that networks with exponentially large weights may indeed require an exponential time to converge. This was first shown in [17] for synchronous updates (a simplified construction appears in [14]), and in [19] for a particular sequential update rule. (For a different sequential rule the result actually follows already from [17] ... |

12 |
Theory of neuromata
- Šíma, Wiedermann
- 1998
Citation Context ... $O(m^{1/2})$ neurons in [22, 23], and it was shown that under some additional constraints this upper bound is tight. The case of sequence processing by symmetric binary-state networks was considered in [48, 51], where it was shown that this model is properly weaker than finite automata, and the respective subclass of the regular languages, so called Hopfield languages was characterized. More precisely, it was ... |

10 |
Automata Networks in Computer Science: Theory and Applications
- Soulié, Robert, et al.
- 1987
Citation Context ... also possible. For example, under sequential mode only one neuron updates its state at each time instant. Also various block-parallel dynamics can be considered, but we do not discuss them here (see [12, 15]). A fundamental property of symmetric networks is that their dynamics are constrained by Liapunov, or "energy" functions. A Liapunov function E is a bounded real-valued function defined on the state s... |

9 |
Discrete neural computation
- Siu, Roychowdhury, et al.
- 1995
Citation Context ... feedforward (acyclic) networks has intimate connections to the classical theory of Boolean circuit complexity [53], and is surveyed briefly in the articles [36, 40], and at greater depth in the books [41, 43, 52]. 2. BASIC NOTIONS AND RESULTS With the brief exception of Section 6 on continuous-time models, we shall be concerned with finite discrete-time recurrent networks. Such a network consists of n computati... |

8 |
On characterizations of the class PSPACE/poly
- Balcázar, Díaz, et al.
- 1987
Citation Context ...nce time. More precisely, one can show that the class of Boolean functions computed by polynomial size binary recurrent nets coincides with the nonuniform complexity class PSPACE/poly considered in [4, 24]. One might think that because of the Liapunov property discussed in Section 2, symmetric recurrent networks would be weaker computational devices than general asymmetric ones. For instance, symmetric... |

8 |
On the computational complexity of binary and analog symmetric Hopfield nets
- Šíma, Orponen, et al.
- 2000
Citation Context ...ing that an arbitrary converging computation of an asymmetric binary network of n neurons can be simulated by a symmetric binary network of $O(n^2)$ neurons. (The overhead was later reduced to $O(n)$ in [50].) The crucial observation is that because the simulated network is binary and deterministic, any converging computation on it must terminate within $2^n$ steps. (Otherwise the network repeats a configur... |

7 |
The computational power of discrete Hopfield nets with hidden units
- Orponen
- 1996
Citation Context ...rks cannot produce arbitrary oscillatory behavior, which seems to be an essential characteristic of general computation, and is also trivially created in asymmetric networks. However, it was shown in [38] that infinite oscillations are in a sense the only feature of general-purpose (digital) computation that cannot be reproduced in symmetric recurrent networks. Even polynomial-size symmetric binary net... |

6 |
Efficient simulation of automata by neural nets
- Alon, Dewdney, et al.
- 1991
Citation Context ...me instant $t = 0, 1, \ldots$, every neuron $j$ in the network has a well-defined state $y_j^{(t)}$, which in a binary-state network comes from the set $\{0, 1\}$, and in an analog-state network from the interval $[0, 1]$. We shall mostly be concerned with the synchronous fully parallel dynamics, under which the evolution of the global network state $\mathbf{y}^{(t)} = (y_1^{(t)}, \ldots, y_n^{(t)}) \in [0, 1]^n$ is determined for all... |

6 |
A continuous-time Hopfield net simulation of discrete neural networks
- Šíma, Orponen
- 2000
Citation Context ...less, it was shown in [39] that asymmetric continuous-time networks based on the saturated-linear activation function can simulate asymmetric binary-state networks with a linear size overhead, and in [49] that the same holds also for symmetric continuous-time networks with respect to converging computations. These results establish that also polynomial-size continuous-time networks have the full compu... |

4 |
Unpredictability and undecidability in physical systems
- Moore
- 1990
Citation Context ...east 114 neurons [27], and even to 25 neurons [23]. The starting point in these constructions, and also in many other recent simulations of Turing machines by finite-dimensional dynamical systems (e.g. [3, 7, 27, 35]), is the well-known correspondence of Turing machines and two-stack pushdown automata. The Turing machine tape is first represented as two opposing stacks, and then the contents of these stacks are enc... |

4 |
Hopfield languages
- Šíma
- 1995
Citation Context ... $O(m^{1/2})$ neurons in [22, 23], and it was shown that under some additional constraints this upper bound is tight. The case of sequence processing by symmetric binary-state networks was considered in [48, 51], where it was shown that this model is properly weaker than finite automata, and the respective subclass of the regular languages, so called Hopfield languages was characterized. More precisely, it was ... |