Department of Computer Science, University of Toronto
Introduction

The work reported here began with the desire to find a network architecture that shares with Boltzmann machines [6, 1, 7] the capacity to learn arbitrary probability distributions over binary vectors, but that does not require the negative phase of Boltzmann machine learning. It was hypothesized that eliminating the negative phase would improve learning performance. This goal was achieved by replacing the Boltzmann machine's symmetric connections with feedforward connections. In analogy with Boltzmann machines, the sigmoid function is used to compute the conditional probability of a unit being on, given the weighted input it receives from other units. Stochastic simulation of such a network is somewhat more complex than for a Boltzmann machine, but is still possible using only local communication. Maximum likelihood learning by gradient ascent can be done with a local Hebb-type rule.
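The following is a minimal sketch of the two ideas just described, not the implementation used in this work. It assumes units take values in {0, 1}, that units are ordered so that unit i receives connections only from units j < i, and that every unit is observed during learning; the stochastic simulation needed when some units are hidden is not shown. The function names and the weight layout (`weights[i][j]` is the connection from unit j to unit i) are hypothetical and chosen for illustration only.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real input to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_network(weights, biases, rng):
    """Draw one binary state vector by a single feedforward pass.

    Unit i is turned on with probability sigmoid(bias_i + sum_j w_ij * s_j),
    where the sum runs over the earlier units j < i only.
    """
    state = []
    for i, bias in enumerate(biases):
        net = bias + sum(weights[i][j] * state[j] for j in range(i))
        state.append(1 if rng.random() < sigmoid(net) else 0)
    return state


def log_likelihood_gradient(weights, biases, state):
    """Gradient of log P(state) for a fully observed state (assumed case).

    For {0, 1} units the derivative with respect to w_ij is
    (s_i - sigmoid(net_i)) * s_j, which uses only quantities available
    at the two units joined by the connection.
    """
    n = len(biases)
    grad_w = [[0.0] * n for _ in range(n)]
    grad_b = [0.0] * n
    for i in range(n):
        net = biases[i] + sum(weights[i][j] * state[j] for j in range(i))
        err = state[i] - sigmoid(net)
        grad_b[i] = err
        for j in range(i):
            grad_w[i][j] = err * state[j]
    return grad_w, grad_b
```

A gradient-ascent step under these assumptions adds a small multiple of `grad_w` and `grad_b` to the weights and biases for each training vector. Unlike Boltzmann machine learning, no second set of statistics from a free-running network has to be subtracted.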