Results 1-10 of 239
Nonlinear component analysis as a kernel eigenvalue problem

1996
Cited by 1048 (72 self)
We describe a new method for performing a nonlinear form of Principal Component Analysis. By the use of integral operator kernel functions, we can efficiently compute principal components in high-dimensional feature spaces, related to input space by some nonlinear map; for instance the space of all possible 5-pixel products in 16x16 images. We give the derivation of the method, along with a discussion of other techniques which can be made nonlinear with the kernel approach; and present first experimental results on nonlinear feature extraction for pattern recognition.
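The kernel idea in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes a homogeneous degree-2 polynomial kernel, uses plain power iteration in place of a full eigendecomposition, and extracts only the first component. All function names are hypothetical.

```python
import math

def poly_kernel(x, y, degree=2):
    """Homogeneous polynomial kernel k(x, y) = (x . y)^degree (assumed choice)."""
    return sum(a * b for a, b in zip(x, y)) ** degree

def centered_gram(points, kernel):
    """Gram matrix of the data after centering in feature space."""
    n = len(points)
    K = [[kernel(p, q) for q in points] for p in points]
    row_mean = [sum(row) / n for row in K]  # equals column means, K is symmetric
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def leading_eigenpair(M, iterations=500):
    """Power iteration for the largest eigenvalue of a symmetric PSD matrix."""
    n = len(M)
    v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iterations):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Mv = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, Mv))  # Rayleigh quotient
    return eigenvalue, v

def first_kernel_component(points, kernel=poly_kernel):
    """Projections of the training points onto the first nonlinear component."""
    Kc = centered_gram(points, kernel)
    lam, alpha = leading_eigenpair(Kc)
    scale = 1.0 / math.sqrt(lam)  # gives the feature-space eigenvector unit norm
    n = len(points)
    return [scale * sum(alpha[j] * Kc[i][j] for j in range(n)) for i in range(n)]

# Small, arbitrary 2-D sample used only for illustration.
sample = [(1.0, 0.5), (2.0, -1.0), (0.5, 1.5),
          (-1.0, 2.0), (-1.5, -0.5), (0.0, 1.0)]
projections = first_kernel_component(sample)
```

The point of the sketch is that only kernel evaluations between data points are needed; the degree-2 feature vectors are never formed explicitly.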
The "Independent Components" of Natural Scenes are Edge Filters
1997
Cited by 477 (27 self)
It has previously been suggested that neurons with line and edge selectivities found in primary visual cortex of cats and monkeys form a sparse, distributed representation of natural scenes, and it has been reasoned that such responses should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent visual features. We show here that a new unsupervised learning algorithm based on information maximization, a nonlinear "infomax" network, when applied to an ensemble of natural scenes produces sets of visual filters that are localized and oriented. Some of these filters are Gabor-like and resemble those produced by the sparseness-maximization network. In addition, the outputs of these filters are as independent as possible, since this infomax network performs Independent Components Analysis or ICA, for sparse (supergaussian) component distributions. We compare the resulting ICA filters and their associated basis functions, with other decorrelating filters produced by Principal Components Analysis (PCA) and zero-phase whitening filters (ZCA). The ICA filters have more sparsely distributed (kurtotic) outputs on natural scenes. They also resemble the receptive fields of simple cells in visual cortex, which suggests that these neurons form a natural, information-theoretic
Regularization networks and support vector machines
Advances in Computational Mathematics, 2000
Cited by 266 (33 self)
Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples – in particular the regression problem of approximating a multivariate function from sparse data. Radial Basis Functions, for example, are a special case of both regularization and Support Vector Machines. We review both formulations in the context of Vapnik’s theory of statistical learning which provides a general foundation for the learning problem, combining functional analysis and statistics. The emphasis is on regression: classification is treated as a special case.
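To make the regression setting concrete, here is a minimal sketch of a regularization network with a Gaussian radial basis function kernel. It assumes the common formulation in which the coefficients solve (K + lam * n * I) c = y and the estimate is f(x) = sum_i c_i k(x, x_i); the function names, the kernel width `sigma`, and the small Gaussian-elimination solver are illustrative choices, not taken from the paper.

```python
import math

def gaussian_kernel(x, y, sigma):
    """Gaussian RBF kernel between two points given as tuples."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_regularization_network(xs, ys, lam=1e-3, sigma=1.0):
    """Return a predictor f(x) = sum_i c_i k(x, x_i) fitted to (xs, ys)."""
    n = len(xs)
    K = [[gaussian_kernel(a, b, sigma) for b in xs] for a in xs]
    for i in range(n):
        K[i][i] += lam * n  # regularization term lam * n * I
    coeffs = solve(K, ys)

    def predict(x):
        return sum(c * gaussian_kernel(x, xi, sigma) for c, xi in zip(coeffs, xs))

    return predict

# Sparse 1-D samples of sin(x), used only as an illustration.
train_x = [(0.0,), (0.5,), (1.0,), (1.5,), (2.0,)]
train_y = [math.sin(x[0]) for x in train_x]
model = fit_regularization_network(train_x, train_y, lam=1e-6, sigma=0.5)
```

A small `lam` makes the fit pass close to the training values, while a large `lam` shrinks the estimate toward zero; that trade-off is the regularization the abstract refers to.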
Learning Overcomplete Representations
2000
Cited by 257 (11 self)
In an overcomplete basis, the number of basis vectors is greater than the dimensionality of the input, and the representation of an input is not a unique combination of basis vectors. Overcomplete representations have been advocated because they have greater robustness in the presence of noise, can be sparser, and can have greater flexibility in matching structure in the data. Overcomplete codes have also been proposed as a model of some of the response properties of neurons in primary visual cortex. Previous work has focused on finding the best representation of a signal using a fixed overcomplete basis (or dictionary). We present an algorithm for learning an overcomplete basis by viewing it as a probabilistic model of the observed data. We show that overcomplete bases can yield a better approximation of the underlying statistical distribution of the data and can thus lead to greater coding efficiency. This can be viewed as a generalization of the technique of independent component analysis and provides a method for Bayesian reconstruction of signals in the presence of noise and for blind source separation when there are more sources than mixtures.
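The non-uniqueness mentioned at the start of this abstract is easy to see in a toy case. The sketch below uses a hand-picked basis of three vectors in two dimensions (an assumption made purely for illustration; the paper learns its basis from data) and shows two different coefficient vectors that reconstruct the same input, one of them sparser than the other.

```python
def reconstruct(basis, coeffs):
    """Linear combination sum_k coeffs[k] * basis[k]."""
    dim = len(basis[0])
    return [sum(c * b[d] for c, b in zip(coeffs, basis)) for d in range(dim)]

def l1_norm(coeffs):
    """Sum of absolute coefficient values, a common sparsity measure."""
    return sum(abs(c) for c in coeffs)

# Three basis vectors for a 2-D input: more vectors than dimensions.
basis = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

dense_code = [1.0, 1.0, 0.0]   # uses the two axis-aligned vectors
sparse_code = [0.0, 0.0, 1.0]  # uses only the diagonal vector

same_input = reconstruct(basis, dense_code) == reconstruct(basis, sparse_code)
sparser = l1_norm(sparse_code) < l1_norm(dense_code)
```

Both codes describe the input (1, 1), so choosing among them requires an extra criterion such as a sparsity-favoring prior on the coefficients.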
Convolutive Blind Separation of Non-Stationary
Cited by 129 (3 self)
Acoustic signals recorded simultaneously in a reverberant environment can be described as sums of differently convolved sources. The task of source separation is to identify the multiple channels and possibly to invert those in order to obtain estimates of the underlying sources. We tackle the problem by explicitly exploiting the nonstationarity of the acoustic sources. Changing cross-correlations at multiple times give a sufficient set of constraints for the unknown channels. A least squares optimization allows us to estimate a forward model, thus identifying the multipath channel. In the same manner we can find an FIR backward model, which generates well-separated model sources. Furthermore, for more than three channels we have sufficient conditions to estimate underlying additive sensor noise powers. We show good performance in real room environments and demonstrate the algorithm's utility for automatic speech recognition.
Learning to Probabilistically Identify Authoritative Documents
In Proceedings of the 17th International Conference on Machine Learning, 2000
Cited by 125 (2 self)
We describe a model of document citation that learns to identify hubs and authorities in a set of linked documents, such as pages retrieved from the world wide web, or papers retrieved from a research paper archive. Unlike the popular HITS algorithm, which relies on dubious statistical assumptions, our model provides probabilistic estimates that have clear semantics. We also find that in general, the identified authoritative documents correspond better to human intuition.
1. Introduction
Bibliometrics has been described as a "series of techniques that seek to quantify the process of written communication" (Ikpaahindi, 1985). It typically attempts to give quantified answers to questions involving the relationships among documents, or authors and documents: "Who are the most authoritative authors in this field?" "What are the seminal papers?" "How many distinct communities are studying this subject?" and many others (see White & McCain, 1989 for details). Traditionally, the s...
A probabilistic framework for the adaptation and comparison of image codes
J. Opt. Soc. Am. A, 1999
Cited by 112 (9 self)
We apply a Bayesian method for inferring an optimal basis to the problem of finding efficient image codes for natural scenes. The basis functions learned by the algorithm are oriented and localized in both space and frequency, bearing a resemblance to two-dimensional Gabor functions, and increasing the number of basis functions results in a greater sampling density in position, orientation, and scale. These properties also resemble the spatial receptive fields of neurons in the primary visual cortex of mammals, suggesting that the receptive-field structure of these neurons can be accounted for by a general efficient coding principle. The probabilistic framework provides a method for comparing the coding efficiency of different bases objectively by calculating their probability given the observed data or by measuring the entropy of the basis function coefficients. The learned bases are shown to have better coding efficiency than traditional Fourier and wavelet bases. This framework also provides a Bayesian solution to the problems of image denoising and filling in of missing pixels. We demonstrate that the results obtained by applying the learned bases to these problems are improved over those obtained with traditional techniques. © 1999 Optical Society of America [S0740-3232(99)03107-5] OCIS codes: 000.5490, 100.2960, 100.3010.
Independent Component Representations for Face Recognition
Cited by 101 (8 self)
In a task such as face recognition, much of the important information may be contained in the high-order relationships among the image pixels. A number of face recognition algorithms employ principal component analysis (PCA), which is based on the second-order statistics of the image set, and does not address high-order statistical dependencies such as the relationships among three or more pixels. Independent component analysis (ICA) is a generalization of PCA which separates the high-order moments of the input in addition to the second-order moments. ICA was performed on a set of face images by an unsupervised learning algorithm derived from the principle of optimal information transfer through sigmoidal neurons. The algorithm maximizes the mutual information between the input and the output, which produces statistically independent outputs under certain conditions. ICA was performed on the face images under two different architectures. The first architecture provided a statistica...
Efficient coding of natural sounds
Nature Neuroscience, 2002
Cited by 93 (3 self)
The auditory system encodes sound by decomposing the amplitude signal arriving at the ear into multiple frequency bands whose center frequencies and bandwidths are approximately logarithmic functions of the distance from the stapes. This particular organization is thought to result from the adaptation of cochlear mechanisms to the statistics of an animal’s auditory environment. Here we report that several basic auditory nerve fiber tuning properties can be accounted for by adapting a population of filter shapes to optimally encode natural sounds. The form of the code is dependent on the class of sounds, resembling a Fourier transformation when optimized for animal vocalizations and a wavelet transformation when optimized for nonbiological environmental sounds. Only for a combined set of vocalizations and environmental sounds does the optimal code follow scaling characteristics that are consistent with physiological data. These results suggest that the population of auditory nerve fibers encodes a broad set of natural sounds in a manner that is consistent with information-theoretic principles.
Multichannel Blind Deconvolution and Equalization Using the Natural Gradient
In The First Signal Processing Workshop on Signal Processing Advances in Wireless Communications, 1997
Cited by 92 (22 self)
Multichannel deconvolution and equalization is an important task for numerous applications in communications, signal processing, and control. In this paper, we extend the efficient natural gradient search method in [1] to derive a set of online algorithms for combined multichannel blind source separation and time-domain deconvolution/equalization of additive, convolved signal mixtures. Through formal analysis, we prove that the doubly-infinite multichannel equalizer based on the maximum entropy cost function with natural gradient possesses the so-called "equivariance property" such that its asymptotic performance depends on the normalized stochastic distribution of the source signals and not on the mixing characteristics of the unknown channel. We also provide the necessary approximations to enable a computationally simple finite-impulse-response implementation of the natural-gradient-based multichannel deconvolution scheme. Simulations indicate the ability of the algorithm to perform e...