Results 1-10 of 16
A Fast Stochastic Error-Descent Algorithm for Supervised Learning and Optimization
 In
, 1993
Cited by 35 (7 self)
A parallel stochastic algorithm is investigated for error-descent learning and optimization in deterministic networks of arbitrary topology. No explicit information about internal network structure is needed. The method is based on the model-free distributed learning mechanism of Dembo and Kailath. A modified parameter update rule is proposed by which each individual parameter vector perturbation contributes a decrease in error. A substantially faster learning speed is hence allowed. Furthermore, the modified algorithm supports learning time-varying features in dynamical networks. We analyze the convergence and scaling properties of the algorithm, and present simulation results for dynamic trajectory learning in recurrent networks. 1 Background and Motivation We address general optimization tasks that require finding a set of constant parameter values p_i that minimize a given error functional E(p). For supervised learning, the error functional consists of some quantitativ...
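The perturbation-based update described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the quadratic test error, the two-sided error measurement, and the division by sigma squared are all assumptions made for illustration. The only property taken from the abstract is that all parameters are perturbed in parallel and the update uses nothing but the measured change in error.

```python
import random

def stochastic_error_descent(error, p, sigma=0.01, mu=0.1, steps=2000, seed=0):
    """Minimize error(p) from function evaluations only (no network model)."""
    rng = random.Random(seed)
    p = list(p)
    for _ in range(steps):
        # Perturb every parameter in parallel by a random +/- sigma.
        pi = [sigma * rng.choice((-1.0, 1.0)) for _ in p]
        # Measure the error change caused by this perturbation vector
        # (two-sided difference, an assumption of this sketch).
        e_plus = error([a + b for a, b in zip(p, pi)])
        e_minus = error([a - b for a, b in zip(p, pi)])
        e_hat = 0.5 * (e_plus - e_minus)
        # Step against the perturbation, scaled by the measured change.
        # Dividing by sigma**2 is a normalization chosen for this sketch.
        p = [a - mu * e_hat * b / sigma ** 2 for a, b in zip(p, pi)]
    return p

def quadratic_error(p, target=(1.0, -2.0, 0.5)):
    # Hypothetical error functional E(p) used only to exercise the sketch.
    return sum((a - t) ** 2 for a, t in zip(p, target))

start = [0.0, 0.0, 0.0]
result = stochastic_error_descent(quadratic_error, start)
print(quadratic_error(start), quadratic_error(result))
```

For this quadratic error and a step size with mu times the number of parameters below 1, every update leaves the error unchanged or lower, which mirrors the claim that each perturbation contributes a decrease in error.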
Alopex: a correlation-based learning algorithm for feedforward and recurrent neural networks
 Neural Computation
, 1994
Cited by 24 (1 self)
We present a learning algorithm for neural networks, called Alopex. Instead of error gradient, Alopex uses local correlations between changes in individual weights and changes in the global error measure. The algorithm does not make any assumptions about transfer functions of individual neurons, and does not explicitly depend on the functional form of the error measure. Hence, it can be used in networks with arbitrary transfer functions and for minimizing a large class of error measures. The learning algorithm is the same for feedforward and recurrent networks. All the weights in a network are updated simultaneously, using only local computations. This allows complete parallelization of the algorithm. The algorithm is stochastic and it uses a ‘temperature’ parameter in a manner similar to that in simulated annealing. A heuristic ‘annealing schedule’ is presented which is effective in finding global minima of error surfaces. In this paper, we report extensive simulation studies illustrating these advantages and show that learning times are comparable to those for standard gradient descent methods. Feedforward networks trained with Alopex are used to solve the MONK’s problems and symmetry problems. Recurrent networks trained with the same algorithm are used for solving temporal XOR problems. Scaling properties of the algorithm are demonstrated using encoder problems of different sizes and advantages of appropriate error measures are illustrated using a variety of problems.
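A correlation-based update in the spirit of this abstract might look like the following Python sketch. It is a simplified illustration, not the published algorithm: the temperature is held fixed (the paper's annealing schedule is not reproduced), and the function names, step size, and quadratic test error are assumptions. Each weight takes a fixed-size step whose sign is drawn with a probability that depends on the correlation between that weight's previous change and the previous change in the global error.

```python
import math
import random

def alopex(error, w, delta=0.01, temperature=1e-4, steps=2000, seed=0):
    """Correlation-based stochastic weight updates (fixed temperature)."""
    rng = random.Random(seed)
    w = list(w)
    prev_error = error(w)
    # The first move is purely random: there is no correlation history yet.
    dw = [delta * rng.choice((-1.0, 1.0)) for _ in w]
    w = [a + b for a, b in zip(w, dw)]
    for _ in range(steps):
        cur_error = error(w)
        d_error = cur_error - prev_error  # global error change of last move
        prev_error = cur_error
        new_dw = []
        for step in dw:
            # Local correlation between this weight's change and the
            # change in the global error measure.
            c = step * d_error
            x = max(-50.0, min(50.0, c / temperature))  # avoid overflow
            # Positive correlation (the move raised the error) makes a
            # negative step more likely, and vice versa.
            p_neg = 1.0 / (1.0 + math.exp(-x))
            new_dw.append(-delta if rng.random() < p_neg else delta)
        dw = new_dw
        # All weights are updated simultaneously.
        w = [a + b for a, b in zip(w, dw)]
    return w

def sum_of_squares(w):
    # Hypothetical error measure used only to exercise the sketch.
    return sum(a * a for a in w)

w0 = [3.0, -3.0]
w_final = alopex(sum_of_squares, w0)
print(sum_of_squares(w0), sum_of_squares(w_final))
```

Because the update only needs the scalar error change and each weight's own last step, nothing in the loop depends on the form of `error`, which is the property the abstract emphasizes.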
An Analysis of Noise in Recurrent Neural Networks: Convergence and Generalization
 IEEE Transactions on Neural Networks
, 1996
Cited by 19 (0 self)
There has been much interest in applying noise to feedforward neural networks in order to observe their effect on network performance. We extend these results by introducing and analyzing various methods of injecting synaptic noise into dynamically-driven recurrent networks during training. We present theoretical results which show that applying a controlled amount of noise during training may improve convergence and generalization performance. In addition, we analyze the effects of various noise parameters (additive vs. multiplicative, cumulative vs. non-cumulative, per time step vs. per string) and predict that best overall performance can be achieved by injecting additive noise at each time step. Noise contributes a second-order gradient term to the error function which can be viewed as an anticipatory agent to aid convergence. This term appears to find promising regions of weight space in the beginning stages of training when the training error is large and should improve convergen...
Analog VLSI Stochastic Perturbative Learning Architectures
 J. Analog Integrated Circuits and Signal Processing
, 1997
Cited by 16 (7 self)
We present analog VLSI neuromorphic architectures for a general class of learning tasks, which include supervised learning, reinforcement learning, and temporal difference learning. The presented architectures are parallel, cellular, sparse in global interconnects, distributed in representation, and robust to noise and mismatches in the implementation. They use a parallel stochastic perturbation technique to estimate the effect of weight changes on network outputs, rather than calculating derivatives based on a model of the network. This "model-free" technique avoids errors due to mismatches in the physical implementation of the network, and more generally allows to train networks of which the exact characteristics and structure are not known. With additional mechanisms of reinforcement learning, networks of fairly general structure are trained effectively from an arbitrarily supplied reward signal. No prior assumptions are required on the structure of the network nor on the specifics of the desired network response.
Multichannel coherent detection for delay-insensitive model-free adaptive control
 in Proc. Int. Symp. Circuits and Systems (ISCAS ’07)
, 2007
Cited by 5 (5 self)
A mixed-signal architecture for continuous-time multidimensional model-free optimization is presented. It is based on multichannel coherent modulation and detection that reliably estimates the objective function’s gradient, with respect to the system parameters, in the presence of time delays. The narrowband nature of the excitation signals reduces the unknown dynamics of the objective function to a single parameter per control channel, the phase delay. An efficient implementation of the adaptive control architecture is presented; it incorporates parallel control channels with individually selectable 6-level phase delay adjustment. Initial experimental results indicate wide operating range covering almost 7 decades of excitation frequencies.
Model of birdsong learning based on gradient estimation by dynamic perturbation of neural conductances
, 2007
Image sharpness and beam focus VLSI sensors for adaptive optics
 IEEE Sensors Journal
, 2002
Cited by 4 (1 self)
High-resolution wavefront control for adaptive optics requires accurate sensing of a measure of optical quality. We present two analog very-large-scale-integration (VLSI) image-plane sensors that supply real-time metrics of image and beam quality, for applications in imaging and line-of-sight laser communication. The image metric VLSI sensor quantifies sharpness of the received image in terms of average rectified spatial gradients. The beam metric VLSI sensor returns first and second order spatial moments of the received laser beam to quantify centroid and width. Closed-loop wavefront control of a laser beam through turbulence is demonstrated using a spatial phase modulator and analog VLSI controller that performs stochastic parallel gradient descent of the beam width metric. Index Terms: Adaptive optics, analog very large scale integration (VLSI), focal-plane image processing, image sensors, optical communication.
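The two metrics named in this abstract can be illustrated in software with a short Python sketch. This is only a numerical analogue under assumed definitions, not a description of the sensor circuits: sharpness is taken here as the mean absolute difference between neighbouring pixels, and beam width as the square root of the intensity-weighted second central moment. The function names and the tiny test images are hypothetical.

```python
def sharpness(img):
    """Average rectified (absolute) spatial gradient over neighbour pairs."""
    rows, cols = len(img), len(img[0])
    total, count = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # horizontal neighbour
                total += abs(img[r][c + 1] - img[r][c])
                count += 1
            if r + 1 < rows:  # vertical neighbour
                total += abs(img[r + 1][c] - img[r][c])
                count += 1
    return total / count

def beam_moments(img):
    """Centroid (first moments) and RMS width (second central moment)."""
    m0 = sum(sum(row) for row in img)
    cx = sum(c * v for row in img for c, v in enumerate(row)) / m0
    cy = sum(r * v for r, row in enumerate(img) for v in row) / m0
    var = sum(((c - cx) ** 2 + (r - cy) ** 2) * v
              for r, row in enumerate(img)
              for c, v in enumerate(row)) / m0
    return (cx, cy), var ** 0.5

checker = [[0, 1], [1, 0]]   # maximally "sharp" 2x2 pattern
flat = [[1, 1], [1, 1]]      # no spatial structure
focused = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
spread = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]

print(sharpness(checker), sharpness(flat))
print(beam_moments(focused), beam_moments(spread))
```

A scalar such as the width returned by `beam_moments` is the kind of quantity a model-free optimizer can drive down without knowing how the optics map control signals to the image.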
Accurate and Precise Computation using Analog VLSI, with Applications to Computer Graphics and Neural Networks
, 1993
Cited by 3 (1 self)
This thesis develops an engineering practice and design methodology to enable us to use CMOS analog VLSI chips to perform more accurate and precise computation. These techniques form the basis of an approach that permits us to build computer graphics and neural network applications using analog VLSI. The nature of the design methodology focuses on defining goals for circuit behavior to be met as part of the design process. To increase the accuracy of analog computation, we develop techniques for creating compensated circuit building blocks, where compensation implies the cancellation of device variations, offsets, and nonlinearities. These compensated building blocks can be used as components in larger and more complex circuits, which can then also be compensated. To this end, we develop techniques for automatically determining appropriate parameters for circuits, using constrained optimization. We also fabricate circuits that implement multidimensional gradient estimation for a grad...
High-Speed, Model-Free Adaptive Control Using Parallel Synchronous Detection
Cited by 3 (3 self)
A VLSI implementation of an adaptive controller performing gradient descent optimization of external performance metrics using parallel synchronous detection is presented. Real-time model-free gradient estimation is done by perturbation of the metrics’ control parameters with narrowband deterministic dithers resulting in fast adaptation and robust performance. A fully translinear design has been employed for the architecture, making the controller operation scalable within a very wide range of frequencies and control bandwidths, and therefore customizable for a variety of systems and applications. Experimental results from a SiGe BiCMOS implementation are provided demonstrating the broadband and high-speed performance of the controller.
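Gradient estimation by dithering and synchronous detection can be sketched numerically in Python as follows. This is a discrete-time illustration under stated assumptions, not the controller described above: each parameter is perturbed with a sinusoid at its own frequency, the measured metric is multiplied by each dither, and the products are averaged over a whole number of dither periods. The function names, the dither frequencies (3 and 5 cycles per window), the amplitude, and the test metric are all hypothetical.

```python
import math

def synchronous_gradient(metric, u, cycles=(3, 5), amplitude=1e-3, samples=200):
    """Estimate d(metric)/du_i by demodulating the metric with each dither.

    cycles[i] is the whole number of dither periods applied to parameter i
    over the averaging window; distinct values keep the channels orthogonal.
    """
    sums = [0.0] * len(u)
    for n in range(samples):
        dithers = [math.sin(2.0 * math.pi * k * n / samples) for k in cycles]
        # Apply all dithers in parallel and read the scalar metric.
        j = metric([ui + amplitude * d for ui, d in zip(u, dithers)])
        # Synchronous detection: correlate the metric with each dither.
        for i, d in enumerate(dithers):
            sums[i] += j * d
    # A unit-slope response contributes amplitude * samples / 2 to each sum.
    return [2.0 * s / (amplitude * samples) for s in sums]

def metric(u):
    # Hypothetical performance metric with a cross term between channels.
    return (u[0] - 1.0) ** 2 + (u[1] + 2.0) ** 2 + 0.5 * u[0] * u[1]

u = [0.5, 0.0]
grad = synchronous_gradient(metric, u)
analytic = [2.0 * (u[0] - 1.0) + 0.5 * u[1], 2.0 * (u[1] + 2.0) + 0.5 * u[0]]
print(grad, analytic)
```

For a quadratic metric and these frequency choices the demodulated estimate matches the analytic gradient up to rounding; for a general metric it is only a small-amplitude approximation.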
Synaptic noise as a means of implementing weight-perturbation learning
 Connection Science
, 2006
Cited by 1 (0 self)
Weight-perturbation (WP) algorithms for supervised and/or reinforcement learning offer improved biological plausibility over backpropagation because of their reduced circuitry requirements for realization in neural hardware. All such algorithms use some form of information source (a means to compare weight changes with changes in output error) to adjust weights. This paper explores the hypothesis that biological synaptic noise might serve as the substrate by which weight perturbation is implemented. We explore the basic synaptic noise hypothesis (BSNH) which embodies the weakest assumptions about the underlying neural circuitry required to implement WP algorithms. The present paper identifies relevant biological constraints consistent with the BSNH, taxonomizes existing WP algorithms in regard to consistency with those constraints, and proposes a new WP algorithm that is fully consistent with the constraints. By comparing the learning effectiveness of these algorithms via simulation studies, we find that all of the algorithms can support traditional neural network learning tasks and have similar generalization characteristics, although the results suggest a tradeoff between learning efficiency and biological accuracy. This establishes the basic result that biological synaptic noise, coupled with appropriate reward, can be exploited to implement WP algorithms for neural network learning.