Results 1-10 of 285
A fast learning algorithm for deep belief nets
 Neural Computation
, 2006
Cited by 930 (51 self)
We show how to use “complementary priors” to eliminate the explaining-away effects that make inference difficult in densely-connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modelled by long ravines in the free-energy landscape of the top-level associative memory and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.
A Framework for Robust Subspace Learning
 International Journal of Computer Vision
, 2003
Cited by 175 (10 self)
Many computer vision, signal processing and statistical problems can be posed as problems of learning low-dimensional linear or multilinear models. These models have been widely used for the representation of shape, appearance, motion, etc., in computer vision applications.
An Active Vision Architecture based on Iconic Representations
 Artificial Intelligence
, 1995
Cited by 143 (13 self)
Active vision systems have the capability of continuously interacting with the environment. The rapidly changing environment of such systems means that it is attractive to replace static representations with visual routines that compute information on demand. Such routines place a premium on image data structures that are easily computed and used. The purpose of this paper is to propose a general active vision architecture based on efficiently computable iconic representations. This architecture employs two primary visual routines, one for identifying the visual image near the fovea (object identification), and another for locating a stored prototype on the retina (object location). This design allows complex visual behaviors to be obtained by composing these two routines with different parameters. The iconic representations are comprised of high-dimensional feature vectors obtained from the responses of an ensemble of Gaussian derivative spatial filters at a number of orientations and...
Robust Principal Component Analysis for Computer Vision
, 2001
Cited by 133 (3 self)
Principal Component Analysis (PCA) has been widely used for the representation of shape, appearance, and motion. One drawback of typical PCA methods is that they are least squares estimation techniques and hence fail to account for "outliers" which are common in realistic training sets. In computer vision applications, outliers typically occur within a sample (image) due to pixels that are corrupted by noise, alignment errors, or occlusion. We review previous approaches for making PCA robust to outliers and present a new method that uses an intra-sample outlier process to account for pixel outliers. We develop the theory of Robust Principal Component Analysis (RPCA) and describe a robust M-estimation algorithm for learning linear multivariate representations of high-dimensional data such as images. Quantitative comparisons with traditional PCA and previous robust algorithms illustrate the benefits of RPCA when outliers are present. Details of the algorithm are described and a software implementation is being made publicly available.
Neural networks for classification: a survey
 and Cybernetics  Part C: Applications and Reviews
, 2000
Cited by 132 (0 self)
Classification is one of the most active research and application areas of neural networks. The literature is vast and growing. This paper summarizes some of the most important developments in neural network classification research. Specifically, the issues of posterior probability estimation, the link between neural and conventional classifiers, the learning and generalization tradeoff in classification, feature variable selection, as well as the effect of misclassification costs are examined. Our purpose is to provide a synthesis of the published research in this area and stimulate further research interests and efforts in the identified topics. Index Terms: Bayesian classifier, classification, ensemble methods, feature variable selection, learning and generalization, misclassification costs, neural networks.
Complementary roles of the basal ganglia and cerebellum in learning and motor control
 Current Opinion in Neurobiology
, 2000
"... motor control ..."
Unsupervised Learning of Distributions on Binary Vectors Using Two Layer Networks
, 1994
Cited by 95 (1 self)
this paper is related to both of these lines of work and has some advantages over each of them. If we find a good model of the distribution, we can tackle other interesting learning problems, such as the problem of estimating the conditional distribution on certain components of the vector x when provided with the values for the other components (a kind of regression problem), or predicting the actual values for certain components of x based on the values of the other components (a kind of pattern completion task). In the example of the binary images presented above, this would amount to the task of recovering the value of a pixel whose value has been corrupted. We can often also use the distribution model to help us in a supervised learning task. This is because it is often easier to express the mapping of an instance to the correct label by using "features" that are correlation patterns among the bits of the instance. For example, it is easier to describe each of the ten digits in terms of patterns such as lines and circles, rather than in terms of the values of individual pixels, which are more likely to change between different instances of the same digit. The process of learning an unknown distribution from examples is usually called density estimation or
Candid covariance-free incremental principal component analysis
 IEEE Trans. Pattern Analysis and Machine Intelligence
, 2003
Cited by 81 (9 self)
Appearance-based image analysis techniques require fast computation of principal components of high-dimensional image vectors. We introduce a fast incremental principal component analysis (IPCA) algorithm, called candid covariance-free IPCA (CCIPCA), used to compute the principal components of a sequence of samples incrementally without estimating the covariance matrix (hence covariance-free). The new method is motivated by the concept of statistical efficiency (the estimate has the smallest variance given the observed data). To do this, it keeps the scale of observations and computes the mean of observations incrementally, which is an efficient estimate for some well-known distributions (e.g., Gaussian), although the highest possible efficiency is not guaranteed in our case because of unknown sample distribution. The method is for real-time applications and, thus, it does not allow iterations. It converges very fast for high-dimensional image vectors. Some links between IPCA and the development of the cerebral cortex are also discussed. Index Terms: Principal component analysis, incremental principal component analysis, stochastic gradient ascent (SGA), generalized Hebbian algorithm (GHA), orthogonal complement.
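The covariance-free idea in the abstract above can be illustrated with a minimal sketch. It assumes the commonly cited form of the first-component update, in which the running estimate v is averaged with each centered sample u weighted by its projection onto v / ||v||. It omits refinements such as an amnesic weighting and the extraction of further components. The function and helper names (`ccipca_first_component`, `make_samples`) and the synthetic data are hypothetical, chosen only for illustration.

```python
import math
import random


def ccipca_first_component(samples):
    """Estimate the leading principal direction in one pass.

    Sketch only: no covariance matrix is formed. The mean is
    tracked incrementally and v is a running average of
    u * (u . v / ||v||), where u is the centered sample.
    """
    mean = None
    v = None
    n = 0
    for x in samples:
        n += 1
        if mean is None:
            mean = [float(xi) for xi in x]
        else:
            # Incremental mean: no stored history is needed.
            mean = [m + (xi - m) / n for m, xi in zip(mean, x)]
        u = [xi - m for xi, m in zip(x, mean)]
        if v is None:
            # Initialise v with the first non-zero centered sample.
            if any(ui != 0.0 for ui in u):
                v = u
            continue
        norm_v = math.sqrt(sum(vi * vi for vi in v))
        proj = sum(ui * vi for ui, vi in zip(u, v)) / norm_v
        v = [(n - 1) / n * vi + proj * ui / n for vi, ui in zip(v, u)]
    if v is None:
        raise ValueError("need at least two distinct samples")
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    return [vi / norm_v for vi in v]


def make_samples(count, seed=0):
    """Synthetic 2-D data whose dominant direction is (0.6, 0.8)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        a = rng.gauss(0.0, 3.0)   # large spread along (0.6, 0.8)
        b = rng.gauss(0.0, 0.3)   # small spread along (-0.8, 0.6)
        samples.append([5.0 + 0.6 * a - 0.8 * b,
                        -2.0 + 0.8 * a + 0.6 * b])
    return samples


direction = ccipca_first_component(make_samples(2000))
# Absolute cosine between the estimate and the true direction.
alignment = abs(0.6 * direction[0] + 0.8 * direction[1])
print(alignment)
```

Each sample is visited once and only the mean and v are stored, which is the property that makes this style of update attractive for high-dimensional image vectors.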
The Principal Components of Natural Images
, 1991
Cited by 80 (2 self)
A neural net was used to analyse samples of natural images and text. For the natural images, components resemble derivatives of Gaussian operators, similar to those found in visual cortex and inferred from psychophysics [4]. While the results from natural images do not depend on scale, those from text images are highly scale dependent. Convolution of one of the text components with an original image shows that it is sensitive to inter-word gaps.
1 Introduction
We live in, and are required to make sense of, a complex visual world. One key to interpreting images is to know something of their statistics. The simplest kind of statistics, based on pixel grey levels, are first order: means, variances and probability distributions of brightness values. Such statistics are very useful, for instance, to set thresholds. We can also ask more complicated questions, such as: how does the value of one pixel depend on that of its neighbours? In images of the real world, nearby pixels will often have...
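The two kinds of statistics mentioned in the excerpt above can be made concrete with a small sketch. It computes first-order statistics (mean and variance of grey levels) and one simple second-order quantity, the Pearson correlation between horizontally adjacent pixels. The function names and the tiny synthetic images are assumptions made for illustration and do not come from the paper.

```python
import math


def first_order_stats(image):
    """Mean and (population) variance of all grey levels."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return mean, variance


def neighbour_correlation(image):
    """Pearson correlation between each pixel and its right neighbour."""
    left = [row[c] for row in image for c in range(len(row) - 1)]
    right = [row[c + 1] for row in image for c in range(len(row) - 1)]
    mean_l = sum(left) / len(left)
    mean_r = sum(right) / len(right)
    cov = sum((a - mean_l) * (b - mean_r) for a, b in zip(left, right))
    var_l = sum((a - mean_l) ** 2 for a in left)
    var_r = sum((b - mean_r) ** 2 for b in right)
    return cov / math.sqrt(var_l * var_r)


# A smooth ramp: neighbouring pixels are strongly related.
ramp = [[r + c for c in range(8)] for r in range(8)]
# A checkerboard: neighbouring pixels always differ.
checker = [[(r + c) % 2 for c in range(8)] for r in range(8)]

print(first_order_stats(ramp))
print(neighbour_correlation(ramp), neighbour_correlation(checker))
```

The ramp and the checkerboard show how two images can differ sharply in the dependence between neighbouring pixels, which first-order statistics alone do not describe.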