An experimental unification of reservoir computing methods
, 2007
Cited by 70 (10 self)
Three different uses of a recurrent neural network (RNN) as a reservoir that is not trained but instead read out by a simple external classification layer have been described in the literature: Liquid State Machines (LSMs), Echo State Networks (ESNs) and the Backpropagation Decorrelation (BPDC) learning rule. Individual descriptions of these techniques exist, but an overview is still lacking. Here, we present a series of experimental results that compares all three implementations, and draw conclusions about the relation between a broad range of reservoir parameters and network dynamics, memory, node complexity and performance on a variety of benchmark tests with different characteristics. Next, we introduce a new measure for the reservoir dynamics based on Lyapunov exponents. Unlike previous measures in the literature, this measure is dependent on the dynamics of the reservoir in response to the inputs, and in the cases we tried, it indicates an optimal value for the global scaling of the weight matrix, irrespective of the standard measures. We also describe the Reservoir Computing Toolbox that was used for these experiments, which implements all the types of Reservoir Computing and allows the easy simulation of a wide range of reservoir topologies for a number of benchmarks.
Generating Text with Recurrent Neural Networks
Cited by 67 (3 self)
Recurrent Neural Networks (RNNs) are very powerful sequence models that do not enjoy widespread use because it is extremely difficult to train them properly. Fortunately, recent advances in Hessian-free optimization have been able to overcome the difficulties associated with training RNNs, making it possible to apply them successfully to challenging sequence problems. In this paper we demonstrate the power of RNNs trained with the new Hessian-Free optimizer (HF) by applying them to character-level language modeling tasks. The standard RNN architecture, while effective, is not ideally suited for such tasks, so we introduce a new RNN variant that uses multiplicative (or “gated”) connections which allow the current input character to determine the transition matrix from one hidden state vector to the next. After training the multiplicative RNN with the HF optimizer for five days on 8 high-end Graphics Processing Units, we were able to surpass the performance of the best previous single method for character-level language modeling, a hierarchical nonparametric sequence model. To our knowledge this represents the largest recurrent neural network application to date.
Learning Recurrent Neural Networks with HessianFree Optimization
Cited by 61 (6 self)
In this work we resolve the long-outstanding problem of how to effectively train recurrent neural networks (RNNs) on complex and difficult sequence modeling problems which may contain long-term data dependencies. Utilizing recent advances in the Hessian-free optimization approach (Martens, 2010), together with a novel damping scheme, we successfully train RNNs on two sets of challenging problems. First, a collection of pathological synthetic datasets which are known to be impossible for standard optimization approaches (due to their extremely long-term dependencies), and second, on three natural and highly complex real-world sequence datasets where we find that our method significantly outperforms the previous state-of-the-art method for training neural sequence models: the Long Short-term Memory approach of Hochreiter and Schmidhuber (1997). Additionally, we offer a new interpretation of the generalized Gauss-Newton matrix of Schraudolph (2002) which is used within the HF approach of Martens.
Dynamical Movement Primitives: Learning Attractor Models for Motor Behaviors
, 2013
Cited by 47 (3 self)
Nonlinear dynamical systems have been used in many disciplines to model complex behaviors, including biological motor control, robotics, perception, economics, traffic prediction, and neuroscience. While often the unexpected emergent behavior of nonlinear systems is the focus of investigations, it is of equal importance to create goal-directed behavior (e.g., stable locomotion from a system of coupled oscillators under perceptual guidance). Modeling goal-directed behavior with nonlinear systems is, however, rather difficult due to the parameter sensitivity of these systems, their complex phase transitions in response to subtle parameter changes, and the difficulty of analyzing and predicting their long-term behavior; intuition and time-consuming parameter tuning play a major role. This letter presents and reviews dynamical movement primitives, a line of research for modeling attractor behaviors of autonomous nonlinear dynamical systems with the help of statistical learning techniques. The essence of our approach is to start with a simple dynamical system,
On the importance of initialization and momentum in deep learning
Cited by 45 (3 self)
Deep and recurrent neural networks (DNNs and RNNs respectively) are powerful models that were considered to be almost impossible to train using stochastic gradient descent with momentum. In this paper, we show that when stochastic gradient descent with momentum uses a well-designed random initialization and a particular type of slowly increasing schedule for the momentum parameter, it can train both DNNs and RNNs (on datasets with long-term dependencies) to levels of performance that were previously achievable only with Hessian-Free optimization. We find that both the initialization and the momentum are crucial since poorly initialized networks cannot be trained with momentum and well-initialized networks perform markedly worse when the momentum is absent or poorly tuned. Our success training these models suggests that previous attempts to train deep and recurrent neural networks from random initializations have likely failed due to poor initialization schemes. Furthermore, carefully tuned momentum methods suffice for dealing with the curvature issues in deep and recurrent network training objectives without the need for sophisticated second-order methods.
Training Recurrent Networks by Evolino
, 2007
Cited by 35 (5 self)
In recent years, gradient-based LSTM recurrent neural networks (RNNs) solved many previously RNN-unlearnable tasks. Sometimes, however, gradient information is of little use for training RNNs, due to numerous local minima. For such cases, we present a novel method: EVOlution of systems with LINear Outputs (Evolino). Evolino evolves weights to the nonlinear, hidden nodes of RNNs while computing optimal linear mappings from hidden state to output, using methods such as pseudo-inverse-based linear regression. If we instead use quadratic programming to maximize the margin, we obtain the first evolutionary recurrent support vector machines. We show that Evolino-based LSTM can solve tasks that Echo State nets (Jaeger, 2004a) cannot and achieves higher accuracy in certain continuous function generation tasks than conventional gradient descent RNNs, including gradient-based LSTM.
An overview of reservoir computing: theory, applications and implementations
 Proceedings of the 15th European Symposium on Artificial Neural Networks
, 2007
Cited by 34 (10 self)
Training recurrent neural networks is hard. Recently it has however been discovered that it is possible to just construct a random recurrent topology, and only train a single linear readout layer. State-of-the-art performance can easily be achieved with this setup, called Reservoir Computing. The idea can even be broadened by stating that any high dimensional, driven dynamic system, operated in the correct dynamic regime can be used as a temporal ‘kernel’ which makes it possible to solve complex tasks using just linear postprocessing techniques. This tutorial will give an overview of current research on theory, application and implementations of Reservoir Computing.
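The core idea of this abstract (a fixed random recurrent network whose only trained part is a linear readout) can be illustrated with a minimal echo-state-style sketch. This is not code from the tutorial: the function names, the reservoir size, the spectral radius of 0.9, the ridge-regression readout, and the toy sine-prediction task are all assumptions chosen for illustration.

```python
import numpy as np

def make_reservoir(n_inputs, n_nodes, spectral_radius=0.9, seed=0):
    """Draw fixed random input and recurrent weights (never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_nodes, n_inputs))
    w = rng.uniform(-0.5, 0.5, size=(n_nodes, n_nodes))
    # Rescale so the largest absolute eigenvalue equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, inputs):
    """Drive the reservoir with the input sequence and record every state."""
    states = np.zeros((len(inputs), w.shape[0]))
    x = np.zeros(w.shape[0])
    for t, u in enumerate(inputs):
        x = np.tanh(w_in @ u + w @ x)
        states[t] = x
    return states

def train_readout(states, targets, ridge=1e-4):
    """Fit the linear readout by ridge regression (the only trained weights)."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)

# Toy task (an assumption): predict the next sample of a sine wave.
signal = np.sin(0.2 * np.arange(301))
inputs = signal[:-1, None]   # shape (300, 1)
targets = signal[1:]         # shape (300,)

w_in, w = make_reservoir(n_inputs=1, n_nodes=50)
states = run_reservoir(w_in, w, inputs)
w_out = train_readout(states, targets)
train_mse = np.mean((states @ w_out - targets) ** 2)
```

Only `w_out` is learned here; `w_in` and `w` stay at their random initial values, which is what separates this setup from training the recurrent weights by gradient descent.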
On the difficulty of training recurrent neural networks
Cited by 31 (5 self)
There are two widely known issues with properly training recurrent neural networks, the vanishing and the exploding gradient problems detailed in Bengio et al. (1994). In this paper we attempt to improve the understanding of the underlying issues by exploring these problems from an analytical, a geometric and a dynamical systems perspective. Our analysis is used to justify a simple yet effective solution. We propose a gradient norm clipping strategy to deal with exploding gradients and a soft constraint for the vanishing gradients problem. We validate empirically our hypothesis and proposed solutions in the experimental section.
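Gradient norm clipping, as named in this abstract, is commonly understood as rescaling the gradient whenever its norm exceeds a threshold, so that its direction is kept and only its length is capped. The sketch below follows that common reading; the function name `clip_gradient_norm`, the flat-list gradient representation, and the threshold value are assumptions for illustration, not details taken from the abstract.

```python
import math

def clip_gradient_norm(grads, threshold):
    """Rescale grads so their Euclidean norm is at most threshold.

    grads is a flat list of floats (an assumed representation);
    the direction of the gradient is preserved.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm > threshold:
        scale = threshold / norm
        return [g * scale for g in grads]
    return list(grads)

# A gradient of norm 5 is scaled down to norm 1; a small one is untouched.
clipped = clip_gradient_norm([3.0, 4.0], threshold=1.0)
unchanged = clip_gradient_norm([0.3, 0.4], threshold=1.0)
```

Clipping the norm rather than each component separately keeps the update pointing in the original descent direction, which is the usual reason given for preferring it.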
Error Minimized Extreme Learning Machine With Growth of Hidden Nodes and Incremental Learning
, 2009
Cited by 19 (4 self)
One of the open problems in neural network research is how to automatically determine network architectures for given applications. In this brief, we propose a simple and efficient approach to automatically determine the number of hidden nodes in generalized single-hidden-layer feedforward networks (SLFNs) which need not be neural alike. This approach referred to as error minimized extreme learning machine (EM-ELM) can add random hidden nodes to SLFNs one by one or group by group (with varying group size). During the growth of the networks, the output weights are updated incrementally. The convergence of this approach is proved in this brief as well. Simulation results demonstrate and verify that our new approach is much faster than other sequential/incremental/growing algorithms with good generalization performance.
In search of the neural circuits of intrinsic motivation
 Frontiers in neuroscience
, 2007
Cited by 19 (8 self)