## Analog VLSI Stochastic Perturbative Learning Architectures (1997)

Venue: | J. Analog Integrated Circuits and Signal Processing |

Citations: | 16 - 7 self |

### BibTeX

@ARTICLE{Cauwenberghs97analogvlsi,

author = {Gert Cauwenberghs},

title = {Analog VLSI Stochastic Perturbative Learning Architectures},

journal = {J. Analog Integrated Circuits and Signal Processing},

year = {1997},

pages = {195--209}

}

### Abstract

We present analog VLSI neuromorphic architectures for a general class of learning tasks, which include supervised learning, reinforcement learning, and temporal difference learning. The presented architectures are parallel, cellular, sparse in global interconnects, distributed in representation, and robust to noise and mismatches in the implementation. They use a parallel stochastic perturbation technique to estimate the effect of weight changes on network outputs, rather than calculating derivatives based on a model of the network. This “model-free” technique avoids errors due to mismatches in the physical implementation of the network, and more generally allows training of networks whose exact characteristics and structure are not known. With additional mechanisms of reinforcement learning, networks of fairly general structure are trained effectively from an arbitrarily supplied reward signal. No prior assumptions are required on the structure of the network or on the specifics of the desired network response.
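The parallel perturbation idea in the abstract can be illustrated in software. The following is a minimal sketch, not the paper's circuit or its exact update rule: it assumes binary perturbations of magnitude `sigma`, a two-sided error measurement, and an update proportional to the perturbation times the measured error difference. The function names, the quadratic test error, and all numeric settings are hypothetical choices for illustration.

```python
import random

def perturbed_error(error_fn, params, perturbation, sigma):
    """Two-sided error difference, scaled by 1 / (2 * sigma**2)."""
    plus = [p + d for p, d in zip(params, perturbation)]
    minus = [p - d for p, d in zip(params, perturbation)]
    return (error_fn(plus) - error_fn(minus)) / (2.0 * sigma ** 2)

def stochastic_error_descent(error_fn, params, sigma=0.01, rate=0.05,
                             steps=500, seed=0):
    """Model-free descent: only error_fn evaluations are used, no derivatives."""
    rng = random.Random(seed)
    params = list(params)
    for _ in range(steps):
        # Perturb every parameter at once with an independent random sign.
        perturbation = [rng.choice((-sigma, sigma)) for _ in params]
        e_hat = perturbed_error(error_fn, params, perturbation, sigma)
        # Each parameter moves against its own perturbation, scaled by e_hat.
        params = [p - rate * d * e_hat for p, d in zip(params, perturbation)]
    return params

def quadratic_error(params, target=(1.0, -2.0, 0.5, 3.0)):
    """Hypothetical stand-in for a measured network error."""
    return sum((p - t) ** 2 for p, t in zip(params, target))

start = [0.0, 0.0, 0.0, 0.0]
trained = stochastic_error_descent(quadratic_error, start)
print(quadratic_error(start), quadratic_error(trained))
```

The loop never inspects how `quadratic_error` is built, which is the sense in which such a scheme is model-free: any black-box error measurement could be substituted.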

### Citations

1328 | Learning to predict by the method of temporal differences
- Sutton
- 1988
Citation Context ...ct the future reward signal from the present state of the network. We define “reinforcement learning” essentially as given in [11], which includes as special cases “time difference learning” or TD(λ) [12], and, to some extent, Q-learning [13] and “advanced heuristic dynamic programming” [14]. The equations are listed below in general form to clarify the similarity with the above supervised perturbative... |

689 |
Analog VLSI and Neural Systems
- Mead
- 1989
Citation Context ...reby neurophysiological models of perception and information processing in living organisms are mapped onto analog VLSI systems that not only emulate their functions but also resemble their structure [2]. Essential to neuromorphic systems are mechanisms of adaptation and learning, modeled after neural “plasticity” in neurobiology [3], [4]. Learning can be broadly defined as a special case of adaptati... |

567 |
Beyond Regression: New Tools for Prediction and Analysis in Behavioral Sciences
- Werbos
- 1974
Citation Context ...dularity that apply to stochastic error descent supervised learning apply here as well. As in (4), stochastic approximation estimates of the gradient components in (15) are

$$\frac{\partial E(t)}{\partial w_i}^{\mathrm{est}} = \omega_i(t)\,\hat{E}(t), \qquad \frac{\partial E(t)}{\partial v_i}^{\mathrm{est}} = \upsilon_i(t)\,\hat{E}(t) \tag{16}$$

where the differential perturbed error

$$\hat{E}(t) = \frac{1}{2\sigma^2}\bigl(E(w + \omega, v + \upsilon, t) - E(w - \omega, v - \upsilon, t)\bigr) \tag{17}$$

is obtained from two-sided parallel random perturbation w ± ω simultaneo... |
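A first-order Taylor expansion shows why the product of a perturbation component and the perturbed error can serve as a gradient estimate. The derivation below is a sketch under assumptions not stated in the excerpt: the perturbation components are taken to be independent, zero-mean, and of fixed magnitude σ, so that their pairwise products average to σ² for identical indices and to zero otherwise. Only the w components are worked out; the v components follow the same pattern.

```latex
\begin{align*}
\hat{E}(t) &= \frac{1}{2\sigma^2}\Bigl(E(w+\omega, v+\upsilon, t) - E(w-\omega, v-\upsilon, t)\Bigr) \\
 &= \frac{1}{\sigma^2}\Bigl(\sum_j \omega_j \frac{\partial E}{\partial w_j}
    + \sum_k \upsilon_k \frac{\partial E}{\partial v_k}\Bigr) + O(\sigma), \\
\mathbb{E}\bigl[\omega_i\,\hat{E}(t)\bigr]
 &= \frac{1}{\sigma^2}\sum_j \mathbb{E}[\omega_i\omega_j]\,\frac{\partial E}{\partial w_j}
  + \frac{1}{\sigma^2}\sum_k \mathbb{E}[\omega_i\upsilon_k]\,\frac{\partial E}{\partial v_k}
  + O(\sigma^2) \\
 &= \frac{\partial E}{\partial w_i} + O(\sigma^2),
 \qquad \text{using } \mathbb{E}[\omega_i\omega_j] = \sigma^2\delta_{ij},\;
 \mathbb{E}[\omega_i\upsilon_k] = 0 .
\end{align*}
```

The even-order terms of the expansion cancel in the two-sided difference, which is why the leading error in the first line is of third order in the perturbation before scaling.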

512 |
Neuronlike adaptive elements that can solve difficult learning control problems
- Barto, Sutton, et al.
Citation Context ...nd the environment in which it operates. The stereotypical example of a system able to learn from a discrete delayed reward or punishment signal is the pole-balancer trained with reinforcement learning [18]. We use stochastic perturbative algorithms for model-free estimation of gradient information [16] in a general framework that includes reinforcement learning under delayed and discontinuous rewards [... |

360 |
The Computational Brain
- Churchland, Sejnowski
- 1992
Citation Context ...nts). Physiological experiments support evidence of local (hebbian [5]) and sparsely globally interconnected (reinforcement [6]) mechanisms of learning and adaptation in biological neural systems [3],[4]. 5 Conclusion Neuromorphic analog VLSI architectures for a general class of learning tasks have been presented, along with key components in their analog VLSI circuit implementation. The architecture... |

288 |
Stochastic Approximation Methods for Constrained and Unconstrained Systems
- Kushner, Clark
- 1978
Citation Context ...or practical (real-time) evaluation. In such cases, a black-box approach to optimization is more effective in every regard. This motivates the use of the well-known technique of stochastic approximation [24] for blind optimization in analog VLSI systems. We apply this technique to supervised learning as well as to more advanced models of “reinforcement” learning under discrete delayed rewards. The connec... |

164 | Learning to predict by the methods of temporal differences - Sutton - 1988 |

123 | Currents carried by sodium and potassium ions through the membrane of the giant axon of Loligo - Hodgkin, Huxley - 1952 |

96 | Neuromorphic electronic systems
- Mead
- 1990
Citation Context ... the desired network response. Key Words: Neural networks, neuromorphic engineering, reinforcement learning, stochastic approximation 1. Introduction Carver Mead introduced “neuromorphic engineering” [1] as an interdisciplinary approach to the design of biologically inspired neural information processing systems, whereby neurophysiological models of perception and information processing in living org... |

87 |
Bee foraging in uncertain environments using predictive hebbian learning
- Montague, Dayan, et al.
- 1995
Citation Context ...this type are neuromorphic in the sense that they emulate classical (pavlovian) conditioning in pattern association as found in biological systems [6] and their mathematical and cognitive models [34],[7]. Furthermore, as shown below, the algorithms lend themselves to analog VLSI implementation in a parallel distributed architecture which, unlike more complicated gradient-based schemes, resembles the ... |

67 | CMOS Analog Integrated Circuits Based on Weak Inversion Operation - Vittoz, Fellrath - 1977 |

65 | Weight perturbation: An optimal architecture and learning technique for analog VLSI feedforward and recurrent multilayer networks
- Jabri, Flower
- 1992
Citation Context ...ilar to random-direction finite-difference gradient descent, have been formulated for blind adaptive control [26], neural networks [16],[27] and the implementation of learning functions in VLSI hardware [28],[29],[22],[30]. The broader class of neural network learning algorithms under this category exhibit the desirable property that the functional form of the parameter updates is “model-free”, i.e., ind... |

63 |
Neural dynamics of attentionally modulated Pavlovian conditioning: Conditioned reinforcement, inhibition, and opponent processing
- Grossberg, Schmajuk
- 1987
Citation Context ...s of this type are neuromorphic in the sense that they emulate classical (pavlovian) conditioning in pattern association as found in biological systems [6] and their mathematical and cognitive models [34],[7]. Furthermore, as shown below, the algorithms lend themselves to analog VLSI implementation in a parallel distributed architecture which, unlike more complicated gradient-based schemes, resembles ... |

62 |
A menu of designs for reinforcement learning over time, in Neural Networks for Control
- Werbos
- 1990
Citation Context ... use stochastic perturbative algorithms for model-free estimation of gradient information [16] in a general framework that includes reinforcement learning under delayed and discontinuous rewards [17]-[21], suitable for learning in physical systems of which neither the characteristics nor the optimization objectives are properly defined. Stochastic error-descent architectures for supervised learning [22] and co... |

60 | Q-Learning
- Watkins, Dayan
- 1992
Citation Context ... present state of the network. We define “reinforcement learning” essentially as given in [18], which includes as special cases “time difference learning” or TD(λ) [19], and, to some extent, Q-learning [20] and “advanced heuristic dynamic programming” [21]. The equations are listed below in general form to clarify the similarity with the above supervised perturbative learning techniques. It will then be... |

47 |
A neural model of attention, reinforcement, and discrimination learning. Int Rev Neurobiol 18:263–327
- Grossberg
- 1975
Citation Context ...]. We use stochastic perturbative algorithms for model-free estimation of gradient information [16] in a general framework that includes reinforcement learning under delayed and discontinuous rewards [17]-[21], suitable for learning in physical systems of which neither the characteristics nor the optimization objectives are properly defined. Stochastic error-descent architectures for supervised learning [22] a... |

44 | A model for the neuronal implementation of selective visual attention based on temporal correlation among neurons. J Comput Neurosci 1--2:141--158 - Niebur, Koch - 1994 |

41 | Current-mode subthreshold MOS circuits for analog VLSI neural systems - Andreou, Boahen, et al. - 1991 |

39 |
Experiments in Nonconvex Optimization: Stochastic Approximation and Function Smoothing and Simulated
- Styblinski, Tang
- 1990
Citation Context ...fowitz algorithm for stochastic approximation [24], essentially similar to random-direction finite-difference gradient descent, have been formulated for blind adaptive control [26], neural networks [16],[27] and the implementation of learning functions in VLSI hardware [28],[29],[22],[30]. The broader class of neural network learning algorithms under this category exhibit the desirable property that the ... |

35 | A fast stochastic error-descent algorithm for supervised learning and optimization
- Cauwenberghs
- 1993
Citation Context ...s [17]-[21], suitable for learning in physical systems of which neither the characteristics nor the optimization objectives are properly defined. Stochastic error-descent architectures for supervised learning [22] and computational primitives of reinforcement learning are combined into an analog VLSI architecture which offers a modular and cellular structure, model-free distributed representation, and robustnes... |

34 | A single-transistor silicon synapse
- Diorio, Hasler, et al.
- 1996
Citation Context ...ver a time interval exceeding 10^9 refresh cycles (several days) [15]. A non-volatile equivalent of the charge-pump adaptive element in Figure 5, which does not require dynamic refresh, is described in [9]. Correspondingly, a non-volatile learning cell performing stochastic error descent can be obtained by substitution of the core adaptive element in Figure 8 below, and more intricate volatile and non-... |

25 |
The empathetic organization
- Lei, Greer
- 2003
Citation Context ...shments). Physiological experiments support evidence of local (hebbian [5]) and sparsely globally interconnected (reinforcement [6]) mechanisms of learning and adaptation in biological neural systems [3],[4]. 5 Conclusion Neuromorphic analog VLSI architectures for a general class of learning tasks have been presented, along with key components in their analog VLSI circuit implementation. The architec... |

24 |
A Parallel Gradient Descent Method for Learning
- Alspector, Meir, et al.
- 1993
Citation Context ...to random-direction finite-difference gradient descent, have been formulated for blind adaptive control [26], neural networks [16],[27] and the implementation of learning functions in VLSI hardware [28],[29],[22],[30]. The broader class of neural network learning algorithms under this category exhibit the desirable property that the functional form of the parameter updates is “model-free”, i.e., independ... |

24 | Summed weight neuron perturbation: An o(n) improvement over weight perturbation
- Flower, Jabri
- 1993
Citation Context ...direction finite-difference gradient descent, have been formulated for blind adaptive control [26], neural networks [16],[27] and the implementation of learning functions in VLSI hardware [28],[29],[22],[30]. The broader class of neural network learning algorithms under this category exhibit the desirable property that the functional form of the parameter updates is “model-free”, i.e., independent of the... |

23 |
A cellular mechanism of classical conditioning in Aplysia: Activity-dependent amplification of presynaptic facilitation
- Hawkins, Abrams, et al.
- 1983
Citation Context ... of the form $q = \sum_i v_i x_i$ [33]. Learning algorithms of this type are neuromorphic in the sense that they emulate classical (pavlovian) conditioning in pattern association as found in biological systems [6] and their mathematical and cognitive models [34],[7]. Furthermore, as shown below, the algorithms lend themselves to analog VLSI implementation in a parallel distributed architecture which, unlike mo... |

23 |
Analog VLSI implementation of neural systems
- Mead, Ismail
- 1989
Citation Context ...c systems, and fluctuations in the environment in which they operate. Examples of early implementations of analog VLSI neural systems with integrated adaptation and learning functions can be found in [8]. While off-chip learning can be effective as long as training is performed with the chip “in the loop”, chip I/O bandwidth limitations make this approach impractical for networks with large number of ... |

21 |
A Stochastic Approximation Technique for Generating Maximum Likelihood Parameter Estimates
- Spall
- 1987
Citation Context ... Variants on the Kiefer-Wolfowitz algorithm for stochastic approximation [24], essentially similar to random-direction finite-difference gradient descent, have been formulated for blind adaptive control [26], neural networks [16],[27] and the implementation of learning functions in VLSI hardware [28],[29],[22],[30]. The broader class of neural network learning algorithms under this category exhibit the d... |

18 |
Model-free distributed learning
- Dembo, Kailath
- 1990
Citation Context ...discrete delayed reward or punishment signal is the pole-balancer trained with reinforcement learning [18]. We use stochastic perturbative algorithms for model-free estimation of gradient information [16] in a general framework that includes reinforcement learning under delayed and discontinuous rewards [17]-[21], suitable for learning in physical systems of which neither the characteristics nor the optimizat... |

16 |
An analog VLSI recurrent neural network learning a continuous-time trajectory
- Cauwenberghs
- 1996
Citation Context ...n of the stochastic error-descent algorithm follows below, as introduced in [22] for efficient supervised learning in analog VLSI. The integrated analog VLSI continuous-time learning system used in [31], [32] forms the basis for the architectures outlined in the sections that follow. 2.3 Stochastic Supervised Learning Let E(p) be the error functional to be minimized, with E a scalar deterministic function... |

15 |
Fault-tolerant dynamic multilevel storage in analog VLSI
- Cauwenberghs, Yariv
- 1994
Citation Context ...4.1.1 Charge-pump adaptive element Figure 5 shows the circuit diagram of a charge-pump adaptive element implementing a volatile synapse. The circuit is a simplified version of the charge pump used in [14] and [32]. When enabled by ENn and ENp (at GND and Vdd potentials, respectively), the circuit generates an incremental update of which the polarity is determined by POL. The amplitude of the POL... |

12 |
A stochastic approximation method
- Robbins, Monro
- 1951
Citation Context ...ic Approximation Techniques Stochastic approximation algorithms [24] have long been known as effective tools for constrained and unconstrained optimization under noisy observations of system variables [25]. Applied to on-line minimization of an error index E(p), the algorithms avoid the computational burden of gradient estimation by directly observing the dependence of the index E on randomly app... |

11 | Analog storage of adjustable synaptic weights - Vittoz, Oguey, et al. - 1991 |

10 |
Oversampling methods for A/D and D/A conversion
- Candy
- 1992
Citation Context ... implementing reinforcement learning using stochastic gradient approximation. (a) Reinforcement learning cell. (b) Adaptive critic cell. ... shaping modulator used for oversampled A/D data conversion [37]. The order-n modulator comprises a cascade of n integrators $x_i(t)$ operating on the difference between the analog input u(t) and the binary modulated output y(t): $x_0(t+1) = x_0(t) + a\,(u(t) - y(t))$ (19... |

9 |
Analog VLSI implementation of gradient descent
- Kirk, Kerns, et al.
- 1993
Citation Context ...quential activation of complementary perturbations π and −π. We note that the synchronous three-phase scheme is not essential and could be replaced by an asynchronous perturbation scheme as in [16] and [42]. While this probably resembles biology more closely, the synchronous gradient estimate (4) using complementary perturbations is computationally more efficient as it cancels error terms up to second ord... |

9 |
Differential conditioning of associative synaptic enhancement in hippocampal brain slices
- Kelso, Brown
- 1986
Citation Context ...ad categories: unsupervised, supervised and reward/punishment (reinforcement). Physiological experiments have revealed plasticity mechanisms in biology that correspond to Hebbian unsupervised learning [5], and classical (pavlovian) conditioning [6], [7] characteristic of reinforcement ... [Footnote: This work was supported by ARPA/ONR under MURI grant N00014-95-1-0409. Chip fabrication was provided through MOSIS.] |

8 | A learning analog neural network chip with continuous-time recurrent dynamics
- Cauwenberghs
- 1994
Citation Context ...ription of the stochastic error-descent algorithm follows below, as introduced in [22] for efficient supervised learning in analog VLSI. The integrated analog VLSI continuous-time learning system used in [31], [32] forms the basis for the architectures outlined in the sections that follow. 2.3 Stochastic Supervised Learning Let E(p) be the error functional to be minimized, with E a scalar deterministic fu... |

7 | Analog memories for VLSI neurocomputing - Horio, Nakamura - 1992 |

7 |
A micropower CMOS algorithmic A/D/A converter
- Cauwenberghs
- 1995
Citation Context ...ry function of the weight value [15]. As in [15], the binary quantization function can be multiplexed over an array of storage cells, and can be implemented by retaining the LSB from A/D/A conversion [41] of the value to be stored. Experimental observation of quantization and refresh in a fabricated 128-element array of memory cells has confirmed stable retention of analog storage at 8-bit effective res... |

4 | Adaptive Retina, in Analog VLSI Implementation of Neural Systems - Mead - 1989 |

3 |
The Synaptic Organization of the Brain, 3rd ed
- Shepherd
- 1990
Citation Context ... not only emulate their functions but also resemble their structure [2]. Essential to neuromorphic systems are mechanisms of adaptation and learning, modeled after neural “plasticity” in neurobiology [3], [4]. Learning can be broadly defined as a special case of adaptation whereby past experience is used effectively in readjusting the network response to previously unseen, although similar, stimuli. ... |

2 |
Analog VLSI long-term dynamic storage
- Cauwenberghs
- 1996
Citation Context ...akage between consecutive refresh cycles [14]. Partial incremental refresh can be directly implemented using the adaptive element in Figure 8 by driving POL with a binary function of the weight value [15]. As in [15], the binary quantization function can be multiplexed over an array of storage cells, and can be implemented by retaining the LSB from A/D/A conversion [41] of the value to be stored. Expe... |

1 |
Differential Conditioning of Associative Synaptic Enhancement in Hippocampal Brain Slices
- Kelso, Brown
- 1986
Citation Context ... global mechanism that quantifies the “fitness” of the network response in terms of teacher target values or discrete rewards (punishments). Physiological experiments support evidence of local (hebbian [5]) and sparsely globally interconnected (reinforcement [6]) mechanisms of learning and adaptation in biological neural systems [3],[4]. 5 Conclusion Neuromorphic analog VLSI architectures for a general... |

1 | Analysis and Verification of an Analog VLSI Outer-Product Incremental Learning System - Cauwenberghs, Neugebauer, et al. - 1992 |

1 |
Mean-Field Theory for Batched-TD(λ), submitted to Neural Computation
- Pineda
- 1996
Citation Context ... $\partial w_i$ (12) ... $\gamma^{\,t'-t-1}\, r(t')$. (13) For γ = 1 and y ≡ q, the equations reduce to TD(λ). Convergence of TD(λ) with probability one has been proven in the general case of linear networks of the form $q = \sum_i v_i x_i$ [33]. Learning algorithms of this type are neuromorphic in the sense that they emulate classical (pavlovian) conditioning in pattern association as found in biological systems [6] and their mathematical a... |

1 |
Reinforcement Learning in a Nonlinear Noise Shaping Oversampled A/D Converter
- Cauwenberghs
- 1997
Citation Context ... $x_i(t+1) = x_i(t) + a\,x_{i-1}(t), \quad i = 1, \ldots, n-1$ where $a = 0.5$. The control objective is to choose the binary sequence y(t) such as to keep the excursion of the integration variables within bounds, $|x_i(t)| < x_{\mathrm{sat}}$ [36]. For the adaptive classifier, we specify a one-hidden-layer neural network, with inputs $x_i(t)$ and output y(t). The network has m hidden units, a tanh(·) sigmoidal nonlinearity in the hidden layer, and... |
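The integrator cascade described in the excerpts on the order-n modulator, in which the first integrator accumulates a·(u − y) and each later integrator accumulates a times the previous integrator's state with a = 0.5, can be simulated directly. The sketch below is illustrative only: `modulator_step`, `simulate`, and in particular `sign_controller` are hypothetical names, and the sign rule is a simple stand-in for the learned neural-network classifier that the excerpt mentions, not a reproduction of it.

```python
import math

def modulator_step(state, u, y, a=0.5):
    """Advance the integrator cascade by one time step.

    x_0(t+1) = x_0(t) + a * (u(t) - y(t))
    x_i(t+1) = x_i(t) + a * x_{i-1}(t),  i = 1 .. n-1
    """
    new_state = [state[0] + a * (u - y)]
    for i in range(1, len(state)):
        # The update reads the previous-step value of x_{i-1}.
        new_state.append(state[i] + a * state[i - 1])
    return new_state

def sign_controller(state):
    """Hypothetical stand-in for the learned classifier: threshold x_0."""
    return 1.0 if state[0] >= 0.0 else -1.0

def simulate(order, inputs, controller, a=0.5):
    """Run the modulator and record the largest integrator excursion."""
    state = [0.0] * order
    peak = 0.0
    outputs = []
    for u in inputs:
        y = controller(state)
        state = modulator_step(state, u, y, a)
        peak = max(peak, max(abs(x) for x in state))
        outputs.append(y)
    return outputs, peak

# First-order example with a slow sine input of amplitude 0.5.
inputs = [0.5 * math.sin(0.1 * t) for t in range(1000)]
outputs, peak = simulate(1, inputs, sign_controller)
print(peak)
```

For a single integrator and inputs bounded by 0.5 in magnitude, the sign rule keeps the state within 0.75 of zero. Keeping every integrator of a higher-order cascade within bounds is the harder control problem that the excerpt assigns to the adaptive classifier.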

1 |
Mean-field theory for batched-TD(λ), submitted to Neural Computation
- Pineda
- 1996
Citation Context ...1 $\gamma^{\,t'-t-1}\, r(t')$. (13) For γ = 1 and y ≡ q, the equations reduce to TD(λ). Convergence of TD(λ) with probability one has been proven in the general case of linear networks of the form $q = \sum_i v_i x_i$ [26]. Learning algorithms of this type are neuromorphic in the sense that they emulate classical (pavlovian) conditioning in pattern association as found in biological systems [6] and their mathematical a... |
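For readers unfamiliar with TD(λ) on a linear predictor q = Σ v_i x_i, the following is a minimal textbook-style sketch with accumulating eligibility traces. It is not a transcription of the paper's equations (12) and (13), which are truncated in the excerpts; the function name, the three-step episode, and the learning rate, λ, and γ values are assumptions chosen for illustration.

```python
def td_lambda_episode(values, features, rewards, rate=0.1, lam=0.5, gamma=1.0):
    """One pass of linear TD(lambda) with accumulating eligibility traces.

    features[t] is the input vector x at step t and rewards[t] is the reward
    received on leaving step t. The prediction is q = sum_i v_i * x_i, and the
    state after the last step is terminal (prediction 0).
    """
    trace = [0.0] * len(values)
    for t, x in enumerate(features):
        q_now = sum(v * xi for v, xi in zip(values, x))
        if t + 1 < len(features):
            q_next = sum(v * xi for v, xi in zip(values, features[t + 1]))
        else:
            q_next = 0.0
        # Temporal-difference error between successive predictions.
        delta = rewards[t] + gamma * q_next - q_now
        # Decay old eligibilities and add the current input vector.
        trace = [gamma * lam * e + xi for e, xi in zip(trace, x)]
        values = [v + rate * delta * e for v, e in zip(values, trace)]
    return values

# Hypothetical three-step episode with one-hot inputs and a reward at the end.
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rewards = [0.0, 0.0, 1.0]
values = [0.0, 0.0, 0.0]
for _ in range(500):
    values = td_lambda_episode(values, features, rewards)
print(values)
```

With γ = 1 and a single reward of 1 at the end of the episode, the total future reward from every step is 1, so each weight should approach 1 as the episode is repeated.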