Advanced Supervised Learning in Multi-layer Perceptrons - From Backpropagation to Adaptive Learning Algorithms (1994)

by Martin Riedmiller

Empirical evaluation of the improved Rprop learning algorithms

by Christian Igel, Michael Hüsken, 2003
Cited by 43 (17 self)
Abstract not found

Problem Solving With Reinforcement Learning

by Gavin Adrian Rummery, 1995
Cited by 42 (0 self)
This dissertation is submitted for consideration for the degree of Doctor of Philosophy at the University of Cambridge.

Summary: This thesis is concerned with practical issues surrounding the application of reinforcement learning techniques to tasks that take place in high-dimensional continuous state-space environments. In particular, the extension of on-line updating methods is considered, where the term implies systems that learn as each experience arrives, rather than storing the experiences for use in a separate off-line learning phase. Firstly, the use of alternative update rules in place of standard Q-learning (Watkins 1989) is examined to provide faster convergence rates. Secondly, the use of multi-layer perceptron (MLP) neural networks (Rumelhart, Hinton and Williams 1986) is investigated to provide suitable generalising function approximators. Finally, consideration is given to the combination of Adaptive Heuristic Critic (AHC) methods and Q-learning to produce systems combining the benefits of real-valued actions and discrete switching

Local Gain Adaptation in Stochastic Gradient Descent

by Nicol N. Schraudolph - In Proc. Intl. Conf. Artificial Neural Networks, 1999
Cited by 42 (9 self)
Gain adaptation algorithms for neural networks typically adjust learning rates by monitoring the correlation between successive gradients. Here we discuss the limitations of this approach, and develop an alternative by extending Sutton's work on linear systems to the general, nonlinear case. The resulting online algorithms are computationally little more expensive than other acceleration techniques, do not assume statistical independence between successive training patterns, and do not require an arbitrary smoothing parameter. In our benchmark experiments, they consistently outperform other acceleration methods, and show remarkable robustness when faced with non-i.i.d. sampling of the input space.

Improving the Rprop Learning Algorithm

by Christian Igel, Michael Hüsken - Proceedings of the Second International Symposium on Neural Computation (NC 2000), 2000
Cited by 35 (7 self)
The Rprop algorithm proposed by Riedmiller and Braun is one of the best performing first-order learning methods for neural networks. We introduce modifications of the algorithm that improve its learning speed. The resulting speedup is experimentally shown for a set of neural network learning tasks as well as for artificial error surfaces.

Rprop - Description and Implementation Details

by Martin Riedmiller, 1994
Cited by 21 (0 self)
... $\Delta_{ij}^{(t)}$. This is based on a sign-dependent adaptation process, similar to the learning-rate adaptation in [4], [5].

$$
\Delta_{ij}^{(t)} =
\begin{cases}
\eta^{+} \cdot \Delta_{ij}^{(t-1)}, & \text{if } \dfrac{\partial E}{\partial w_{ij}}^{(t-1)} \cdot \dfrac{\partial E}{\partial w_{ij}}^{(t)} > 0 \\[1ex]
\eta^{-} \cdot \Delta_{ij}^{(t-1)}, & \text{if } \dfrac{\partial E}{\partial w_{ij}}^{(t-1)} \cdot \dfrac{\partial E}{\partial w_{ij}}^{(t)} < 0 \\[1ex]
\Delta_{ij}^{(t-1)}, & \text{else}
\end{cases}
\qquad (2)
$$

where $0 < \eta^{-} < 1 < \eta^{+}$. In words, the adaptation-rule works as follows: Every time the partial
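
The case distinction in equation (2) can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `rprop_step_sizes` and the default factors `eta_minus=0.5` and `eta_plus=1.2` are assumptions chosen for illustration (they only need to satisfy 0 < eta_minus < 1 < eta_plus), and the sketch covers the step-size adaptation only, not the subsequent weight update.

```python
def rprop_step_sizes(prev_grads, grads, prev_steps,
                     eta_minus=0.5, eta_plus=1.2):
    """Adapt per-weight step sizes according to the sign rule of equation (2).

    prev_grads: partial derivatives dE/dw_ij from step t-1
    grads:      partial derivatives dE/dw_ij from step t
    prev_steps: step sizes Delta_ij from step t-1
    The default factors are illustrative assumptions, not values from the excerpt.
    """
    if not 0 < eta_minus < 1 < eta_plus:
        raise ValueError("require 0 < eta_minus < 1 < eta_plus")

    new_steps = []
    for g_prev, g, step in zip(prev_grads, grads, prev_steps):
        product = g_prev * g
        if product > 0:
            # Same sign twice in a row: grow the step size.
            step = step * eta_plus
        elif product < 0:
            # Sign change: the last step was too large, shrink it.
            step = step * eta_minus
        # product == 0: keep the previous step size (the "else" case).
        new_steps.append(step)
    return new_steps


# Three weights: unchanged sign, sign flip, and a zero previous gradient.
steps = rprop_step_sizes([1.0, 1.0, 0.0], [2.0, -1.0, 3.0], [0.1, 0.1, 0.1])
```

Only the sign of the gradient product enters the rule, so the magnitude of the derivatives has no influence on how the step sizes change.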

Evaluation of policy gradient methods and variants on the cart-pole benchmark

by Martin Riedmiller - In: ADPRL, 2007
Cited by 13 (2 self)
In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, ‘vanilla’ policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated and reused, and new algorithms can be inserted with ease.

Exploring Constructive Cascade Networks

by N. K. Treadgold, et al., 1999
Cited by 12 (0 self)
Abstract not found

On the Correspondence between Neural Folding Architectures and Tree Automata

by Andreas Küchler, 1998
Cited by 10 (1 self)
The folding architecture together with adequate supervised training algorithms is a special recurrent neural network model designed to solve inductive inference tasks on structured domains. Recently, the generic architecture has been proven as a universal approximator of mappings from rooted labeled ordered trees to real vector spaces. In this article we explore formal correspondences to the automata (language) theory in order to characterize the computational power (representational capabilities) of different instances of the generic folding architecture. As the main result we prove that simple instances of the folding architecture have the computational power of at least the class of deterministic bottom-up tree automata. It is shown how architectural constraints like the number of layers, the type of the activation functions (first-order vs. higher-order) and the transfer functions (threshold vs. sigmoid) influence the representational capabilities. All proofs are carried out in a c...

Application of Sequential Reinforcement Learning to Control Dynamic Systems

by Martin Riedmiller - In IEEE International Conference on Neural Networks (ICNN '96), 1996
Cited by 9 (7 self)
The article describes the structure of a neural reinforcement learning controller, based on the approach of asynchronous dynamic programming [BBS93]. The learning controller is applied to a well-known benchmark problem, the cart-pole system. In crucial difference to previous approaches, the goal of learning is not only to avoid failure, but moreover to stabilize the cart in the middle of the track, with the pole standing in an upright position. The aim is to learn high quality control trajectories known from conventional controller design, by providing only a minimum amount of a priori knowledge and teaching information.

1. Introduction
In many tasks to be solved by learning controllers we are faced with the following situation: An unknown system has to be manipulated by an agent or more technically, by a controller, to show a desired behavior. Often, this can only be done by a sequence of control decisions or actions, and the result of the control strategy can only be judged at the ...

Genetic Programming Can Discover Fast and General Learning Rules for Neural Networks

by Amr Radi, Riccardo Poli - in Third Annual Genetic Programming Conference (GP'98), 1998
Cited by 7 (5 self)
The Standard BackPropagation (SBP) algorithm for training neural networks suffers from several problems. In this paper, a new technique based upon Genetic Programming (GP) is proposed to overcome some of these problems. We have used GP to discover new supervised learning algorithms. A new learning algorithm has been discovered and compared with SBP on different problems and has been shown to provide better performance. This study indicates that there exist many supervised learning algorithms better than, but similar to, SBP and that GP can be used to discover them.

1 Introduction
Supervised learning algorithms are by far the most frequently used methods to train artificial neural networks. The Standard BackPropagation (SBP) algorithm represents a computationally effective method for the training of multilayer networks which has been applied to a number of learning tasks in science, engineering, finance and other disciplines. The SBP learning algorithm has indeed emerged ...