Results 1-10 of 23
Boltzmann machines
, 2007
Cited by 220 (21 self)
A Boltzmann Machine is a network of symmetrically connected, neuron-like units that make stochastic decisions about whether to be on or off. Boltzmann machines have a simple learning algorithm that allows them to discover interesting features in datasets composed of binary vectors. The learning algorithm is very slow in networks with many layers of feature detectors, but it can be made much faster by learning one layer of feature detectors at a time. Boltzmann machines are used to solve two quite different computational problems. For a search problem, the weights on the connections are fixed and are used to represent the cost function of an optimization problem. The stochastic dynamics of a Boltzmann machine then allow it to sample binary state vectors that represent good solutions to the optimization problem. For a learning problem, the Boltzmann machine is shown a set of binary data vectors and it must find weights on the connections so that the data vectors are good solutions to the optimization problem defined by those weights. To solve a learning problem, Boltzmann machines make many small updates to their weights, and each update requires them to solve many different search problems.
The stochastic dynamics of a Boltzmann machine
When unit i is given the opportunity to update its binary state, it first computes its total input, $z_i$, which is the sum of its own bias, $b_i$, and the weights on connections coming from other active units: $z_i = b_i + \sum_j s_j w_{ij}$, where $w_{ij}$ is the weight on the connection between units i and j, and $s_j$ is 1 if unit j is on and 0 otherwise.
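The total-input computation described in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the tiny three-unit network, and its weights are hypothetical, and the logistic rule used to turn the total input into an "on" probability is the standard Boltzmann machine update, which the excerpt above is cut off before stating.

```python
import math
import random

def total_input(i, states, weights, biases):
    """Bias of unit i plus the weights from every other unit that is on."""
    return biases[i] + sum(
        states[j] * weights[i][j] for j in range(len(states)) if j != i
    )

def update_unit(i, states, weights, biases, rng=random):
    """Stochastically update unit i in place and return its 'on' probability."""
    z = total_input(i, states, weights, biases)
    p_on = 1.0 / (1.0 + math.exp(-z))  # assumed logistic rule (not in the excerpt)
    states[i] = 1 if rng.random() < p_on else 0
    return p_on

# Hypothetical network: 3 units, symmetric weights, no self-connections.
weights = [[0.0, 1.5, -2.0],
           [1.5, 0.0, 0.5],
           [-2.0, 0.5, 0.0]]
biases = [0.1, -0.3, 0.2]
states = [1, 0, 1]

# Unit 1 is off, so only unit 2 contributes: 0.1 + (-2.0) = -1.9
z0 = total_input(0, states, weights, biases)
```

Repeatedly calling `update_unit` on randomly chosen units is what lets the network sample state vectors; units that are off contribute nothing to a neighbour's total input.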
Gradient calculation for dynamic recurrent neural networks: a survey
 IEEE Transactions on Neural Networks
, 1995
Cited by 180 (3 self)
We survey learning algorithms for recurrent neural networks with hidden units, and put the various techniques into a common framework. We discuss fixed-point learning algorithms, namely recurrent backpropagation and deterministic Boltzmann Machines, and non-fixed-point algorithms, namely backpropagation through time, Elman's history cutoff, and Jordan's output feedback architecture. Forward propagation, an online technique that uses adjoint equations, and variations thereof, are also discussed. In many cases, the unified presentation leads to generalizations of various sorts. We discuss advantages and disadvantages of temporally continuous neural networks in contrast to clocked ones, and continue with some "tricks of the trade" for training, using, and simulating continuous-time and recurrent neural networks. We present some simulations, and at the end, address issues of computational complexity and learning speed.
Modeling Pixel Means and Covariances Using Factorized Third-Order Boltzmann Machines
, 2010
Cited by 75 (2 self)
Learning a generative model of natural images is a useful way of extracting features that capture interesting regularities. Previous work on learning such models has focused on methods in which the latent features are used to determine the mean and variance of each pixel independently, or on methods in which the hidden units determine the covariance matrix of a zero-mean Gaussian distribution. In this work, we propose a probabilistic model that combines these two approaches into a single framework. We represent each image using one set of binary latent features that model the image-specific covariance and a separate set that model the mean. We show that this approach provides a probabilistic framework for the widely used simple-cell complex-cell architecture, that it produces very realistic samples of natural images, and that it extracts features that yield state-of-the-art recognition accuracy on the challenging CIFAR-10 dataset.
Learning to Represent Spatial Transformations with Factored Higher-Order Boltzmann Machines
, 2010
Cited by 73 (18 self)
To allow the hidden units of a restricted Boltzmann machine to model the transformation between two successive images, Memisevic and Hinton (2007) introduced three-way multiplicative interactions that use the intensity of a pixel in the first image as a multiplicative gain on a learned, symmetric weight between a pixel in the second image and a hidden unit. This creates cubically many parameters, which form a three-dimensional interaction tensor. We describe a low-rank approximation to this interaction tensor that uses a sum of factors, each of which is a three-way outer product. This approximation allows efficient learning of transformations between larger image patches. Since each factor can be viewed as an image filter, the model as a whole learns optimal filter pairs for efficiently representing transformations. We demonstrate the learning of optimal filter pairs from various synthetic and real image sequences. We also show how learning about image transformations allows the model to perform a simple visual analogy task, and we show how a completely unsupervised network trained on transformations perceives multiple motions of transparent dot patterns in the same way as humans.
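The sum-of-factors idea in this abstract can be illustrated with a small Python sketch. It assumes the approximation takes the form w_ijk = sum over f of A[i][f] * B[j][f] * C[k][f], which is one reading of "a sum of factors, each of which is a three-way outer product". The matrix names `A`, `B`, `C`, the helper functions, and all numbers are hypothetical and are not taken from the paper.

```python
def factored_weight(A, B, C, i, j, k):
    """Entry (i, j, k) of the interaction tensor rebuilt from its factors."""
    n_factors = len(A[0])
    return sum(A[i][f] * B[j][f] * C[k][f] for f in range(n_factors))

def parameter_counts(n_in, n_out, n_hid, n_factors):
    """Parameters in the full tensor versus the factored form."""
    full = n_in * n_out * n_hid                    # cubic growth
    factored = n_factors * (n_in + n_out + n_hid)  # linear in each dimension
    return full, factored

# Hypothetical factor matrices with 2 factors (columns).
A = [[1.0, 0.0], [0.5, 2.0]]  # first-image pixels x factors
B = [[2.0, 1.0], [0.0, 3.0]]  # second-image pixels x factors
C = [[1.0, 1.0], [4.0, 0.5]]  # hidden units x factors

w = factored_weight(A, B, C, 1, 1, 0)       # 0.5*0.0*1.0 + 2.0*3.0*1.0 = 6.0
full, factored = parameter_counts(100, 100, 100, 50)
```

With 100 units along each dimension and 50 factors, the full tensor would need 1,000,000 parameters while the factored form needs 15,000, which is the kind of saving that makes larger image patches tractable.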
3D Object Recognition with Deep Belief Nets
 Advances in Neural Information Processing Systems 22
, 2009
Cited by 59 (8 self)
We introduce a new type of top-level model for Deep Belief Nets and evaluate it on a 3D object recognition task. The top-level model is a third-order Boltzmann machine, trained using a hybrid algorithm that combines both generative and discriminative gradients. Performance is evaluated on the NORB database (normalized-uniform version), which contains stereo-pair images of objects under different lighting conditions and viewpoints. Our model achieves 6.5% error on the test set, which is close to the best published result for NORB (5.9%) using a convolutional neural net that has built-in knowledge of translation invariance. It substantially outperforms shallow models such as SVMs (11.6%). DBNs are especially suited for semi-supervised learning, and to demonstrate this we consider a modified version of the NORB recognition task in which additional unlabeled images are created by applying small translations to the images in the database. With the extra unlabeled data (and the same amount of labeled data as before), our model achieves 5.2% error.
Unsupervised learning of image transformations
 IN COMPUTER VISION AND PATTERN RECOGNITION. IEEE COMPUTER SOCIETY
, 2007
Cited by 50 (18 self)
We describe a probabilistic model for learning rich, distributed representations of image transformations. The basic model is defined as a gated conditional random field that is trained to predict transformations of its inputs using a factorial set of latent variables. Inference in the model consists in extracting the transformation, given a pair of images, and can be performed exactly and efficiently. We show that, when trained on natural videos, the model develops domain specific motion features, in the form of fields of locally transformed edge filters. When trained on affine, or more general, transformations of still images, the model develops codes for these transformations, and can subsequently perform recognition tasks that are invariant under these transformations. It can also fantasize new transformations on previously unseen images. We describe several variations of the basic model and provide experimental results that demonstrate its applicability to a variety of tasks.
Factored 3-Way Restricted Boltzmann Machines For Modeling Natural Images
, 2010
Cited by 48 (4 self)
Deep belief nets have been successful in modeling handwritten characters, but it has proved more difficult to apply them to real images. The problem lies in the restricted Boltzmann machine (RBM) which is used as a module for learning deep belief nets one layer at a time. The Gaussian-Binary RBMs that have been used to model real-valued data are not a good way to model the covariance structure of natural images. We propose a factored 3-way RBM that uses the states of its hidden units to represent abnormalities in the local covariance structure of an image. This provides a probabilistic framework for the widely used simple/complex cell architecture. Our model learns binary features that work very well for object recognition on the “tiny images” data set. Even better features are obtained by then using standard binary RBMs to learn a deeper model.
On the Complexity of Computing and Learning with Multiplicative Neural Networks
 NEURAL COMPUTATION
Cited by 37 (3 self)
In a great variety of neuron models, neural inputs are combined using the summing operation. We introduce the concept of multiplicative neural networks that contain units which multiply their inputs instead of summing them and, thus, allow inputs to interact nonlinearly. The class of multiplicative neural networks comprises such widely known and well-studied network types as higher-order networks and product unit networks. We investigate the complexity of computing and learning for multiplicative neural networks. In particular, we derive upper and lower bounds on the Vapnik-Chervonenkis (VC) dimension and the pseudo dimension for various types of networks with multiplicative units. As the most general case, we consider feedforward networks consisting of product and sigmoidal units, showing that their pseudo dimension is bounded from above by a polynomial with the same order of magnitude as the currently best known bound for purely sigmoidal networks. Moreover, we show that this bound holds even in the case when the unit type, product or sigmoidal, may be learned. Crucial for these results are calculations of solution set components bounds for new network classes. As to lower bounds, we construct product unit networks of fixed depth with superlinear VC dimension. For sigmoidal networks of higher order, we establish polynomial bounds that, in contrast to previous results, do not involve any restriction of the network order. We further consider various classes of higher-order units, also known as sigma-pi units, that are characterized by connectivity constraints. In terms of these, we derive some asymptotically tight bounds.
Discrete restricted Boltzmann machines
, 2015
Cited by 15 (3 self)
We describe discrete restricted Boltzmann machines: probabilistic graphical models with bipartite interactions between visible and hidden discrete variables. Examples are binary restricted Boltzmann machines and discrete naïve Bayes models. We detail the inference functions and distributed representations arising in these models in terms of configurations of projected products of simplices and normal fans of products of simplices. We bound the number of hidden variables, depending on the cardinalities of their state spaces, for which these models can approximate any probability distribution on their visible states to any given accuracy. In addition, we use algebraic methods and coding theory to compute their dimension.
A Neural Architecture for a Class of Abduction Problems
 IEEE Transactions on Systems Man and Cybernetics
, 1996
Cited by 10 (0 self)
The general task of abduction is to infer a hypothesis that best explains a set of data. A typical subtask of this is to synthesize a composite hypothesis that best explains the entire data from elementary hypotheses which can explain portions of it. The synthesis subtask of abduction is computationally expensive, more so in the presence of certain types of interactions between the elementary hypotheses. In this paper, we first formulate the abduction task as a nonmonotonic constrained-optimization problem. We then consider a special version of the general abduction task that is linear and monotonic. Next, we describe a neural network based on the Hopfield model of computation for the special version of the abduction task. The connections in this network are symmetric, the energy function contains product forms, and the minimization of this function requires a network of order greater than two. We then discuss another neural architecture which is composed of functional module...