Results 1 - 7 of 7
Architectural Bias in Recurrent Neural Networks: Fractal Analysis
 IEEE TRANSACTIONS ON NEURAL NETWORKS
"... We have recently shown that when initialized with "small" weights, recurrent neural networks (RNNs) with standard sigmoidtype activation functions are inherently biased towards Markov models, i.e. even prior to any training, RNN dynamics can be readily used to extract finite memory machin ..."
Abstract

Cited by 46 (9 self)
We have recently shown that when initialized with "small" weights, recurrent neural networks (RNNs) with standard sigmoid-type activation functions are inherently biased towards Markov models, i.e., even prior to any training, RNN dynamics can be readily used to extract finite-memory machines (Hammer & Tino, 2002; Tino, Cernansky & Benuskova, 2002; Tino, Cernansky & Benuskova, 2002a). Following Christiansen and Chater (1999), we refer to this phenomenon as the architectural bias of RNNs. In this paper we further extend our work on the architectural bias in RNNs by performing a rigorous fractal analysis of recurrent activation patterns. We assume the network is driven by sequences obtained by traversing an underlying finite-state transition diagram, a scenario that has frequently been considered in the past, e.g. when studying RNN-based learning and implementation of regular grammars and finite-state transducers. We obtain lower and upper bounds on various types of fractal dimensions, such as the box-counting and Hausdorff dimensions. It turns out that not only can the recurrent activations inside RNNs with small initial weights be exploited to build Markovian predictive models, but the activations also form fractal clusters whose dimension can be bounded by the scaled entropy of the underlying driving source. The scaling factors are fixed and are given by the RNN parameters.
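The contractive behaviour underlying this bias is easy to probe numerically. The sketch below (all sizes, weight scales, and the two-symbol alphabet are illustrative choices, not taken from the paper) drives an untrained tanh RNN with small random weights over a random symbol sequence: hidden states reached after the same recent input suffix cluster far more tightly than the activations as a whole, which is what makes Markovian prediction from the raw activations possible.

```python
import numpy as np

rng = np.random.default_rng(0)

H, A = 8, 2                           # hidden units, alphabet size (illustrative)
W_h = rng.normal(0, 0.1, (H, H))      # "small" recurrent weights
W_x = rng.normal(0, 0.1, (H, A))      # "small" input weights

def run(seq):
    """Drive the untrained RNN and collect its hidden activations."""
    h = np.zeros(H)
    states = []
    for s in seq:
        h = np.tanh(W_h @ h + W_x @ np.eye(A)[s])
        states.append(h.copy())
    return np.array(states)

seq = rng.integers(0, A, 500)
states = run(seq)

# Contractive dynamics: states reached after the same recent suffix
# (here, the last two symbols) spread much less than the activations
# overall -- the basis of the extracted finite-memory machines.
overall = states.std(axis=0).max()
clusters = {}
for t in range(1, len(seq)):
    suf = (int(seq[t - 1]), int(seq[t]))
    clusters.setdefault(suf, []).append(states[t])
for suf in sorted(clusters):
    spread = np.array(clusters[suf]).std(axis=0).max()
    print(suf, round(float(spread / overall), 3))
```

Each printed ratio compares the within-suffix spread to the overall spread; with contractive (small-weight) dynamics the ratios stay well below one.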
Complex dynamics and the structure of small neural networks
 Network: Computation in Neural Systems
, 2002
"... ..."
Incremental Training of First Order Recurrent Neural Networks to Predict a Context-Sensitive Language
, 2003
"... In recent years it has been shown that first order recurrent neural networks trained by gradientdescent can learn not only regular but also simple contextfree and contextsensitive languages. However, the success rate was generally low and severe instability issues were encountered. The present st ..."
Abstract

Cited by 9 (2 self)
In recent years it has been shown that first-order recurrent neural networks trained by gradient descent can learn not only regular but also simple context-free and context-sensitive languages. However, the success rate was generally low and severe instability issues were encountered. The present study examines the hypothesis that a combination of evolutionary hill climbing with incremental learning and a well-balanced training set enables first-order recurrent networks to reliably learn context-free and mildly context-sensitive languages. In particular, we trained the networks to predict symbols in string sequences of the context-sensitive language {a^n b^n c^n | n >= 1}. Comparative experiments with and without incremental learning indicated that incremental learning can accelerate and facilitate training. Furthermore, incrementally trained networks generally resulted in monotonic trajectories in hidden-unit activation space, while the trajectories of non-incrementally trained networks were oscillating. The non-incrementally trained networks were more likely to generalise.
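For concreteness, the structure of this prediction task can be sketched as follows, assuming the target language is {a^n b^n c^n | n >= 1} (a reading reconstructed from the paper's title; the helper names are invented for illustration). While the block of a's is being read, its length is still unknown, so the next symbol is not predictable; from the first b onward, n is fixed and every remaining symbol is forced.

```python
def string(n):
    """One string of the (assumed) language {a^n b^n c^n | n >= 1}."""
    return "a" * n + "b" * n + "c" * n

def predictable(s):
    """Indices whose next symbol is fully determined.

    While a's are being read the next symbol could be 'a' or 'b';
    once the first 'b' appears, n is known and the rest is forced.
    """
    first_b = s.index("b")
    return [i for i in range(len(s) - 1) if i >= first_b]

print(string(3), predictable(string(3)))
```

A network "learns" the language in this setting exactly when it gets all the forced predictions right for string lengths beyond those seen in training.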
Power and Limits of Recurrent Neural Networks for Symbolic Sequences Processing
, 2009
"... A recurrent neural network is a class of neural network where connections between neurons form a directed cycle. These socalled recurrent connections allow spreading information about past neural activities in network, which enables to process temporal inputs. Although they are theoretically equiva ..."
Abstract
Recurrent neural networks are a class of neural networks in which connections between neurons form a directed cycle. These so-called recurrent connections spread information about past neural activity through the network, which enables it to process temporal inputs. Although recurrent networks are theoretically equivalent to Turing machines, their widespread use is restricted by computationally expensive training and by the lack of knowledge about the internal representation mechanisms of this class of networks. Our thesis studies the properties of recurrent neural networks processing symbolic inputs. We focus mainly on describing their behaviour in terms of dynamical systems. We describe the dynamics of a randomly initialized neural network and its relation to variable-length Markov prediction models. In the main part of our work, we present visualization, clustering, and state-space analysis methods as effective tools for a thorough study of the capabilities of recurrent networks on prediction tasks. In the experimental part of the thesis, we focus on the changes that emerge during training. We are mostly interested in how the naive Markovian dynamics of a randomly initialized network change during training in relation to various factors such as the input sequence, training algorithm, network architecture, number of hidden units, etc. We consider not only simple recurrent networks before and after training, but also the computational capabilities of the newer echo state network approach, which uses a large, randomly initialized neural reservoir whose dynamics are the subject of our interest. We demonstrate ...
Incremental Training Of First Order Recurrent Neural Networks To Predict A Context-Sensitive Language
"... ..."
(Show Context)
Dynamic Network Functional Comparison via Approximate Bisimulation
"... Abstract: It is generally unknown how to formally determine whether different neural networks have a similar behaviour. This question intimately relates to the problem of finding a suitable similarity measure to identify bounds on the inputoutput response distances of neural networks, which has s ..."
Abstract
Abstract: It is generally unknown how to formally determine whether different neural networks behave similarly. This question is intimately related to the problem of finding a suitable similarity measure that identifies bounds on the input-output response distances of neural networks, which has several interesting theoretical and computational implications. For example, it can allow one to speed up learning by restricting the network parameter space, or to test the robustness of a network with respect to parameter variation. In this paper we develop a procedure for comparing neural structures with one another. In particular, we consider dynamic networks composed of neural units characterised by nonlinear differential equations, described as autonomous continuous dynamical systems. The comparison is established by importing and adapting, from the formal verification setting, δ-approximate bisimulation techniques for nonlinear systems. We have successfully tested the proposed approach on continuous-time recurrent neural networks (CTRNNs).
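The quantity being bounded, a worst-case distance δ between the responses of two networks, can be illustrated with a purely empirical probe (this is not the paper's formal bisimulation construction; the CTRNN parameters, integration scheme, and perturbation are invented for illustration).

```python
import numpy as np

n, tau, dt, T = 3, 1.0, 0.01, 10.0     # toy network and integration settings
rng = np.random.default_rng(2)
W = rng.normal(size=(n, n))

def simulate(W):
    """Euler-integrate the CTRNN  dy/dt = (-y + tanh(W y)) / tau."""
    y = np.full(n, 0.1)
    traj = []
    for _ in range(int(T / dt)):
        y = y + dt * (-y + np.tanh(W @ y)) / tau
        traj.append(y.copy())
    return np.array(traj)

# Worst-case distance between the network and a perturbed copy over one
# trajectory: an empirical stand-in for the delta that a formal
# delta-approximate bisimulation would bound for all trajectories.
delta = float(np.max(np.abs(simulate(W) - simulate(W + 0.01))))
print(f"empirical delta: {delta:.4f}")
```

A formal δ-approximate bisimulation guarantees such a bound for every admissible trajectory, which is what makes it usable for the parameter-space restriction and robustness tests mentioned above.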