Results 1 – 3 of 3
Learning stochastic feedforward networks
, 1990
"... Introduction The work reported here began with the desire to find a network architecture that shared with Boltzmann machines [6, 1, 7] the capacity to learn arbitrary probability distributions over binary vectors, but that did not require the negative phase of Boltzmann machine learning. It was hypo ..."
Abstract

Cited by 13 (1 self)
Introduction The work reported here began with the desire to find a network architecture that shared with Boltzmann machines [6, 1, 7] the capacity to learn arbitrary probability distributions over binary vectors, but that did not require the negative phase of Boltzmann machine learning. It was hypothesized that eliminating the negative phase would improve learning performance. This goal was achieved by replacing the Boltzmann machine's symmetric connections with feedforward connections. In analogy with Boltzmann machines, the sigmoid function was used to compute the conditional probability of a unit being on from the weighted input from other units. Stochastic simulation of such a network is somewhat more complex than for a Boltzmann machine, but is still possible using local communication. Maximum-likelihood, gradient-ascent learning can be done with a local Hebb-type rule.
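The feedforward sampling scheme the abstract describes can be sketched in a few lines: units are visited in feedforward order, and each unit turns on with probability given by the sigmoid of its weighted input from earlier units. The layer sizes and random weights below are hypothetical, chosen only for illustration; they are not from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Hypothetical layer sizes and weights (illustrative only).
sizes = [4, 6, 3]
weights = [rng.normal(scale=0.5, size=(m, n)) for m, n in zip(sizes, sizes[1:])]
biases = [rng.normal(scale=0.1, size=n) for n in sizes[1:]]

def sample(v):
    """Sample layer by layer: each unit is on with probability
    sigmoid(weighted input from the previous layer)."""
    layers = [v]
    for W, b in zip(weights, biases):
        p = sigmoid(layers[-1] @ W + b)
        layers.append((rng.random(p.shape) < p).astype(float))
    return layers

layers = sample(rng.integers(0, 2, size=sizes[0]).astype(float))
```

Because each unit's state depends only on earlier units, a single forward pass produces an exact sample from the network's joint distribution, with no equilibration phase of the kind Boltzmann machines require.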
Weight Space Probability Densities in Stochastic Learning: I. Dynamics and Equilibria
 Advances in Neural Information Processing Systems
, 1993
"... The ensemble dynamics of stochastic learning algorithms can be studied using theoretical techniques from statistical physics. We develop the equations of motion for the weight space probability densities for stochastic learning algorithms. We discuss equilibria in the diffusion approximation and ..."
Abstract

Cited by 9 (5 self)
The ensemble dynamics of stochastic learning algorithms can be studied using theoretical techniques from statistical physics. We develop the equations of motion for the weight space probability densities for stochastic learning algorithms. We discuss equilibria in the diffusion approximation and provide expressions for special cases of the LMS algorithm. The equilibrium densities are not in general thermal (Gibbs) distributions in the objective function being minimized, but rather depend upon an effective potential that includes diffusion effects. Finally we present an exact analytical expression for the time evolution of the density for a learning algorithm with weight updates proportional to the sign of the gradient.

1 Introduction: Theoretical Framework

Stochastic learning algorithms involve weight updates of the form

ω(n + 1) = ω(n) + μ(n) H[ω(n); x(n)]   (1)

where ω ∈ R^m is the vector of m weights, μ is the learning rate, H[·] ∈ R^m is the update function, an...
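The ensemble picture in the abstract can be illustrated numerically: simulate many independent weights evolving under the same stochastic update rule and watch the empirical density settle into a stationary form. The sketch below uses the sign-of-gradient update mentioned in the abstract on a hypothetical 1-D quadratic objective with noisy gradient samples; the objective, noise scale, and learning rate are all illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical objective E(w) = 0.5 * w**2, so the true gradient is w.
# Each step sees a noisy gradient sample and moves by the *sign* of it.
n_ensemble = 2000
mu = 0.05                                     # assumed constant learning rate
w = rng.normal(scale=2.0, size=n_ensemble)    # initial weight-space density

for n in range(500):
    noisy_grad = w + rng.normal(scale=0.5, size=n_ensemble)
    w = w - mu * np.sign(noisy_grad)          # sign-of-gradient update

# The ensemble relaxes toward a stationary density around the minimum,
# whose width reflects the interplay of step size and gradient noise.
print(float(np.mean(w)), float(np.std(w)))
```

The final spread of `w` is the kind of equilibrium density the paper characterizes analytically: it is not a Gibbs distribution in E(w), but depends on an effective potential shaped by the diffusion induced by the noisy updates.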
A Stop Criterion for the Boltzmann Machine Learning Algorithm
, 1995
"... : Ackley, Hinton and Sejnowski introduced a very interesting and versatile learning algorithm for the Boltzmann machine (BM). However it is difficult to decide when to stop the learning procedure. Experiments have shown that the BM may destroy previously achieved results when the learning proces ..."
Abstract
Ackley, Hinton and Sejnowski introduced a very interesting and versatile learning algorithm for the Boltzmann machine (BM). However it is difficult to decide when to stop the learning procedure. Experiments have shown that the BM may destroy previously achieved results when the learning process is executed for too long. This paper introduces a new quantity, the conditional divergence, measuring the learning success for the inputs of the data set. To demonstrate its use, some experiments are presented, based on the Encoder Problem. 1 Introduction The Boltzmann machine (BM), introduced by Ackley, Hinton and Sejnowski in [Ack 84], is one of the most interesting neural networks. This paper first summarizes the basic concepts of the BM and gives in chapter 2 a short description of the learning algorithm, which was also introduced in [Ack 84]. Chapter 3 analyzes the convergence behavior of the algorithm and introduces a new quantity which makes it possible to decide when to stop th...
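The abstract does not define the conditional divergence precisely, but one plausible reading is an average, over the inputs of the data set, of the divergence between the desired and learned conditional output distributions, with training stopped once this quantity stops decreasing. The sketch below implements that hypothetical reading with a standard KL divergence; the function names, the toy encoder-style distributions, and the averaging scheme are all assumptions for illustration.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions."""
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def conditional_divergence(data_cond, model_cond, input_probs):
    """Hypothetical stop-criterion quantity: input-weighted average of the
    divergence between desired and learned conditional output distributions."""
    return sum(pi * kl(p, q)
               for pi, p, q in zip(input_probs, data_cond, model_cond))

# Toy encoder-style example: 2 inputs, each with a 2-outcome output distribution.
data_cond  = [[1.0, 0.0], [0.0, 1.0]]   # desired conditionals
model_cond = [[0.9, 0.1], [0.2, 0.8]]   # learned conditionals
d = conditional_divergence(data_cond, model_cond, [0.5, 0.5])
# Training would be stopped once d stops decreasing (or begins to rise).
```

Monitoring a quantity like `d` after each learning epoch captures the abstract's concern: continued training that increases the divergence on the data-set inputs is actively destroying previously achieved results.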