
An Empirical Study of Learning Speed in Back-Propagation Networks (1988)

by S E Fahlman
Results 1 - 10 of 156

A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm

by Martin Riedmiller, Heinrich Braun - IEEE International Conference on Neural Networks, 1993
Cited by 505 (32 self)
A new learning algorithm for multilayer feedforward networks, RPROP, is proposed. To overcome the inherent disadvantages of pure gradient-descent, RPROP performs a local adaptation of the weight-updates according to the behaviour of the error function. In substantial difference to other adaptive techniques, the effect of the RPROP adaptation process is not blurred by the unforeseeable influence of the size of the derivative but only dependent on the temporal behaviour of its sign. This leads to an efficient and transparent adaptation process. The promising capabilities of RPROP are shown in comparison to other well-known adaptive techniques.
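The sign-based adaptation described in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the simplified variant without weight backtracking, the function name `rprop_minimize` and its parameters are hypothetical, and the factors 1.2 and 0.5 and the step bounds are commonly quoted defaults treated here as assumptions.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def rprop_minimize(grad_fn, w0, steps=100, step_init=0.1,
                   eta_plus=1.2, eta_minus=0.5,
                   step_min=1e-6, step_max=50.0):
    """Sign-based minimization sketch (assumed variant, no weight backtracking).

    grad_fn maps a list of weights to a list of partial derivatives.
    Only the sign of each derivative is used, never its magnitude.
    """
    w = list(w0)
    step = [step_init] * len(w)      # one step size per weight
    prev_grad = [0.0] * len(w)
    for _ in range(steps):
        grad = list(grad_fn(w))
        for i in range(len(w)):
            agreement = grad[i] * prev_grad[i]
            if agreement > 0:
                # Same sign as last time: grow the step for this weight.
                step[i] = min(step[i] * eta_plus, step_max)
            elif agreement < 0:
                # Sign flipped (a minimum was jumped over): shrink the step
                # and skip the update for this weight in this iteration.
                step[i] = max(step[i] * eta_minus, step_min)
                grad[i] = 0.0
            w[i] -= sign(grad[i]) * step[i]
            prev_grad[i] = grad[i]
    return w
```

For example, `rprop_minimize(lambda w: [2.0 * (w[0] - 3.0)], [0.0], steps=200)` minimizes a one-dimensional quadratic. Because only signs enter the update, multiplying the gradient by any positive constant leaves the trajectory unchanged, which is the property the abstract emphasizes.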

First and Second-Order Methods for Learning: between Steepest Descent and Newton's Method

by Roberto Battiti - Neural Computation, 1992
Cited by 108 (6 self)
On-line first-order backpropagation is sufficiently fast and effective for many large-scale classification problems, but for very high-precision mappings batch processing may be the method of choice. This paper reviews first- and second-order optimization methods for learning in feedforward neural networks. The viewpoint is that of optimization: many methods can be cast in the language of optimization techniques, allowing the transfer to neural nets of detailed results about computational complexity and safety procedures to ensure convergence and to avoid numerical problems. The review is not intended to deliver detailed prescriptions for the most appropriate methods in specific applications, but to illustrate the main characteristics of the different methods and their mutual relations.

Design and Analysis of a Computational Model of Cooperative Coevolution

by Mitchell A. Potter, 1997
Cited by 78 (2 self)
Abstract not found

Symbiotic Evolution of Neural Networks in Sequential Decision Tasks

by David Eric Moriarty, 1997
Cited by 58 (5 self)
Abstract not found

Support vector machines for speech recognition

by Aravind Ganapathiraju, Jonathan Hamaker, Joseph Picone - Proceedings of the International Conference on Spoken Language Processing, 1998
Cited by 47 (2 self)
Statistical techniques based on hidden Markov models (HMMs) with Gaussian emission densities have dominated signal processing and pattern recognition literature for the past 20 years. However, HMMs trained using maximum likelihood techniques suffer from an inability to learn discriminative information and are prone to overfitting and over-parameterization. Recent work in machine learning has focused on models, such as the support vector machine (SVM), that automatically control generalization and parameterization as part of the overall optimization process. In this paper, we show that SVMs provide a significant improvement in performance on a static pattern classification task based on the Deterding vowel data. We also describe an application of SVMs to large vocabulary speech recognition, and demonstrate an improvement in error rate on a continuous alphadigit task (OGI Alphadigits) and a large vocabulary conversational speech task (Switchboard). Issues related to the development and optimization of an SVM/HMM hybrid system are discussed.

The Acquisition of Lexical Semantics for Spatial Terms: A Connectionist Model of Perceptual Categories

by Terrance Philip Regier, 1992
Cited by 40 (2 self)
This thesis describes a connectionist model which learns to perceive spatial events and relations in simple movies of 2-dimensional objects, so as to name the events and relations as a speaker of a particular natural language would. Thus, the model learns perceptually grounded semantics for natural language spatial terms. Natural languages differ -- sometimes dramatically -- in the ways in which they structure space. The aim here has been to have the model be able to perform this learning task for terms from any natural language, and to have learning take place in the absence of explicit negative evidence, in order to rule out ad hoc solutions and to approximate the conditions under which children learn. The central focus of this thesis is a...

Comparison of Optimized Backpropagation Algorithms

by W. Schiffmann, M. Joost, R. Werner - Proc. of ESANN'93, Brussels, 1993
Cited by 36 (1 self)
Backpropagation is one of the most famous training algorithms for multilayer perceptrons. Unfortunately, it can be very slow for practical applications. Over the last years, many improvement strategies have been developed to speed up backpropagation. It is very difficult to compare these different techniques, because most of them have been tested on various specific data sets. Most of the reported results are based on tiny, artificial training sets such as XOR, encoder, or decoder. It is very doubtful whether these results hold for more complicated practical applications. In this report, an overview of many different speedup techniques is given. All of them were assessed on a very hard practical classification task, which consists of a big medical data set. As you will see, many of these optimized algorithms fail to learn the data set. 1 Introduction This report is intended to summarize our experience using many different speedup techniques for the backpropagation algorithm. We have...

Modular Neural Networks for Learning Context-Dependent Game Strategies

by Justin A. Boyan - Master’s thesis, Computer Speech and Language Processing, 1992
Cited by 31 (3 self)
The method of temporal differences (TD) is a learning technique which specialises in predicting the likely outcome of a sequence over time. Examples of such sequences include speech frame vectors, whose outcome is a phoneme or word decision, and positions in a board game, whose outcome is a win/loss decision. Recent results by Tesauro in the domain of backgammon indicate that a neural network, trained by TD methods to evaluate positions generated by self-play, can reach an advanced level of backgammon skill. For my summer thesis project, I first implemented the TD/neural network learning algorithms and confirmed Tesauro's results, using the domains of tic-tac-toe and backgammon. Then, motivated by Waibel's success with modular neural networks for phoneme recognition, I experimented with using two modular architectures (DDD and Meta-Pi) in place of the monolithic networks. I found that using the modular networks significantly enhanced the ability of the backgammon evaluator to change it...

The theory of segmental hidden Markov models

by M. J. F. Gales, S. J. Young, 1993
Cited by 30 (0 self)
Abstract not found

VERBMOBIL: The Use of Prosody in the Linguistic Components of a Speech Understanding System

by Elmar Nöth, Anton Batliner, Andreas Kießling, Ralf Kompe, Heinrich Niemann, 2000
Cited by 25 (5 self)
In this paper, we show how prosody can be used in speech understanding systems. This is demonstrated with the VERBMOBIL speech-to-speech translation system which, to our knowledge, is the first complete system which successfully uses prosodic information in the linguistic analysis. Prosody is used by computing probabilities for clause boundaries, accentuation, and different types of sentence mood for each of the word hypotheses computed by the word recognizer. These probabilities guide the search of the linguistic analysis. Disambiguation is already achieved during the analysis and not by a prosodic verification of different linguistic hypotheses. So far, the most useful prosodic information is provided by clause boundaries. These are detected with a recognition rate of 94%. For the parsing of word hypotheses graphs, the use of clause boundary probabilities yields a speed-up of 92% and a 96% reduction of alternative readings.