Results 1–10 of 27
The "Independent Components" of Natural Scenes are Edge Filters
, 1997
Abstract

Cited by 517 (27 self)
It has previously been suggested that neurons with line and edge selectivities found in primary visual cortex of cats and monkeys form a sparse, distributed representation of natural scenes, and it has been reasoned that such responses should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent visual features. We show here that a new unsupervised learning algorithm based on information maximization, a nonlinear "infomax" network, when applied to an ensemble of natural scenes produces sets of visual filters that are localized and oriented. Some of these filters are Gabor-like and resemble those produced by the sparseness-maximization network. In addition, the outputs of these filters are as independent as possible, since this infomax network performs Independent Components Analysis, or ICA, for sparse (super-Gaussian) component distributions. We compare the resulting ICA filters, and their associated basis functions, with other decorrelating filters produced by Principal Components Analysis (PCA) and zero-phase whitening filters (ZCA). The ICA filters have more sparsely distributed (kurtotic) outputs on natural scenes. They also resemble the receptive fields of simple cells in visual cortex, which suggests that these neurons form a natural, information-theoretic ...
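The abstract's central comparison (ICA outputs on sparse data are more kurtotic than PCA outputs) can be illustrated on synthetic data. This is a minimal sketch, not the paper's infomax network: it uses scikit-learn's FastICA estimator, and Laplacian sources stand in for sparse "edge" responses to natural scenes.

```python
# Sketch: ICA recovers sparse (super-Gaussian) sources from a linear mix,
# and its outputs are more kurtotic than PCA outputs on the same data.
import numpy as np
from scipy.stats import kurtosis
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(0)
n_samples, n_sources = 5000, 4

# Laplacian sources have positive excess kurtosis (super-Gaussian).
S = rng.laplace(size=(n_samples, n_sources))
A = rng.normal(size=(n_sources, n_sources))   # unknown mixing matrix
X = S @ A.T                                    # observed mixtures

Y_ica = FastICA(n_components=n_sources, random_state=0).fit_transform(X)
Y_pca = PCA(n_components=n_sources).fit_transform(X)

# ICA outputs should be markedly sparser (more kurtotic) than PCA outputs.
print("mean excess kurtosis, ICA:", kurtosis(Y_ica).mean())
print("mean excess kurtosis, PCA:", kurtosis(Y_pca).mean())
```

PCA only decorrelates, so its components remain linear blends of the sources and their kurtosis is pulled toward Gaussian; ICA's non-Gaussianity objective recovers the original sparse sources.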
Non Linear Neurons in the Low Noise Limit: A Factorial Code Maximizes Information Transfer
, 1994
Abstract

Cited by 148 (18 self)
We investigate the consequences of maximizing information transfer in a simple neural network (one input layer, one output layer), focussing on the case of non-linear transfer functions. We assume that both receptive fields (synaptic efficacies) and transfer functions can be adapted to the environment. The main result is that, for bounded and invertible transfer functions, in the case of a vanishing additive output noise and no input noise, maximization of information (Linsker's infomax principle) leads to a factorial code, hence to the same solution as required by the redundancy-reduction principle of Barlow. We show also that this result is valid for linear, and more generally unbounded, transfer functions, provided optimization is performed under an additive constraint, that is, a constraint which can be written as a sum of terms, each one specific to one output neuron. Finally, we study the effect of a non-zero input noise. We find that, at first order in the input noise, assumed to be small ...
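The infomax-to-factorial-code result can be demonstrated numerically. The sketch below is illustrative, not the paper's derivation: it applies the Bell and Sejnowski infomax rule in its natural-gradient form, with a logistic transfer function, to a two-source mixture and checks that the learned transform unmixes the sources.

```python
# Sketch: batch natural-gradient infomax with a logistic nonlinearity.
# For super-Gaussian sources, maximizing output entropy drives W toward
# the unmixing solution, i.e. a factorial code of the inputs.
import numpy as np

rng = np.random.default_rng(1)
n = 20000
S = rng.laplace(size=(2, n))               # independent super-Gaussian sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])     # mixing matrix (treated as unknown)
X = A @ S                                   # observed mixtures

W = np.eye(2)                               # transform to be learned
lr = 0.05
for _ in range(400):
    U = W @ X
    Y = 1.0 / (1.0 + np.exp(-U))            # logistic transfer function
    # Natural-gradient infomax step: dW = (I + (1 - 2Y) U^T / n) W
    W += lr * (np.eye(2) + (1.0 - 2.0 * Y) @ U.T / n) @ W

# W @ A should approach a scaled permutation matrix: each output then
# carries exactly one independent source.
print(np.round(W @ A, 2))
```

The factorial-code property shows up in `W @ A`: each row is dominated by a single entry, so each output neuron encodes one independent component.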
Learning the Higher-Order Structure of a Natural Sound
, 1996
Abstract

Cited by 73 (7 self)
Unsupervised learning algorithms paying attention only to second-order statistics ignore the phase structure (higher-order statistics) of signals, which contains all the informative temporal and spatial coincidences which we think of as `features'. Here we discuss how an Independent Component Analysis (ICA) algorithm may be used to elucidate the higher-order structure of natural signals, yielding their independent basis functions. This is illustrated with the ICA transform of the sound of a fingernail tapping musically on a tooth. The resulting independent basis functions look like the sounds themselves, having the same temporal envelopes and the same musical pitches. Thus they reflect both the phase and frequency information inherent in the data.
Independent component analysis applied to feature extraction from colour and stereo images
 Network: Computation in Neural Systems
, 2000
Abstract

Cited by 63 (5 self)
Previous work has shown that independent component analysis (ICA) applied to feature extraction from natural image data yields features resembling Gabor functions and simple-cell receptive fields. This article considers the effects of including chromatic and stereo information. The inclusion of colour leads to features divided into separate red/green, blue/yellow, and bright/dark channels. Stereo image data, on the other hand, leads to binocular receptive fields which are tuned to various disparities. The similarities between these results and observed properties of simple cells in primary visual cortex are further evidence for the hypothesis that visual cortical neurons perform some type of redundancy reduction, which was one of the original motivations for ICA in the first place. In addition, ICA provides a principled method for feature extraction from colour and stereo images; such features could be used in image processing operations such as denoising and compression, as well as in pattern recognition.
Edges are the `Independent Components' of Natural Scenes.
 in Advances in Neural Information Processing Systems
, 1996
Abstract

Cited by 60 (3 self)
Field (1994) has suggested that neurons with line and edge selectivities found in primary visual cortex of cats and monkeys form a sparse, distributed representation of natural scenes, and Barlow (1989) has reasoned that such responses should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent visual features. We show here that nonlinear `infomax', when applied to an ensemble of natural scenes, produces sets of visual filters that are localised and oriented. Some of these filters are Gabor-like and resemble those produced by the sparseness-maximisation network of Olshausen & Field (1996). In addition, the outputs of these filters are as independent as possible, since the infomax network is able to perform Independent Components Analysis (ICA). We compare the resulting ICA filters and their associated basis functions, with other decorrelating filters produced by Principal Components Analysis (PCA) and zero-phase whitening filters (ZCA). ...
Explaining Away in Weight Space
, 2000
Abstract

Cited by 26 (2 self)
Explaining away has mostly been considered in terms of inference of states in belief networks. We show how it can also arise in a Bayesian context in inference about the weights governing relationships such as those between stimuli and reinforcers in conditioning experiments such as backward blocking. We show how explaining away in weight space can be accounted for using an extension of a Kalman filter model; provide a new approximate way of looking at the Kalman gain matrix as a whitener for the correlation matrix of the observation process; suggest a network implementation of this whitener using an architecture due to Goodall; and show that the resulting model exhibits backward blocking.
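The backward-blocking phenomenon described in this abstract can be reproduced with a plain Kalman filter over weights; this sketch uses made-up trial counts and noise parameters rather than anything from the paper. Joint training of two stimuli leaves their weight estimates negatively correlated in the posterior covariance, so later credit given to stimulus A is "explained away" from stimulus B.

```python
# Sketch: Kalman filter over conditioning weights exhibiting backward
# blocking. Parameters (obs_var, drift, trial counts) are illustrative.
import numpy as np

def kalman_step(w, P, x, r, obs_var=0.5, drift=0.01):
    P = P + drift * np.eye(len(w))          # state diffusion between trials
    k = P @ x / (x @ P @ x + obs_var)       # Kalman gain
    w = w + k * (r - x @ w)                 # prediction-error update
    P = P - np.outer(k, x) @ P              # posterior covariance update
    return w, P

w, P = np.zeros(2), np.eye(2)
for _ in range(30):                          # phase 1: A+B together -> reward
    w, P = kalman_step(w, P, np.array([1.0, 1.0]), 1.0)
wB_after_phase1 = w[1]

for _ in range(30):                          # phase 2: A alone -> reward
    w, P = kalman_step(w, P, np.array([1.0, 0.0]), 1.0)

# w_B falls during phase 2 even though B is never presented: the negative
# weight covariance built in phase 1 lets A's success explain away B.
print(wB_after_phase1, w[1])
```

The key mechanism is the off-diagonal of `P`: after phase 1 the filter only knows that w_A + w_B is near 1, so raising its estimate of w_A forces its estimate of w_B down.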
The Exploitation of Regularities in the Environment by the Brain
 Behavioral and Brain Sciences
Abstract

Cited by 25 (1 self)
Statistical regularities of the environment are important for learning, memory, intelligence, inductive inference, and in fact for any area of cognitive science where an information-processing brain promotes survival by exploiting them. This has been recognised by many of those interested in cognitive function, starting with Helmholtz, Mach and Pearson, and continuing through Craik, Tolman, Attneave, and Brunswik. In the current era many of us have begun to show how neural mechanisms exploit the regular statistical properties of natural images. Shepard proposed that the apparent trajectory of an object when seen successively at two positions results from internalising the rules of kinematic geometry, and although kinematic geometry is not statistical in nature, this is clearly a related idea. Here it is argued that Shepard's term, "internalisation", is insufficient because it is also necessary to derive an advantage from the process. Having mechanisms selectively sensitive to the spatiotemporal patterns of excitation commonly experienced when viewing moving objects would facilitate the detection, interpolation, and extrapolation of such motions, and might explain the twisting motions that are experienced. Although Shepard's explanation in terms of Chasles' rule seems doubtful, his theory and experiments illustrate that local twisting motions are needed for the analysis of moving objects and provoke thoughts about how they might be detected.
Analyzing Hyperspectral Data with Independent Component Analysis
 Proc. SPIE AIPR Workshop, volume 9
, 1997
Abstract

Cited by 21 (0 self)
Hyperspectral image sensors provide images with a large number of contiguous spectral channels per pixel and enable information about different materials within a pixel to be obtained. The problem of spectrally unmixing materials may be viewed as a specific case of the blind source separation problem, where the data consist of mixed signals (in this case minerals) and the goal is to determine the contribution of each mineral to the mix without prior knowledge of the minerals in the mix. The technique of Independent Component Analysis (ICA) assumes that the spectral components are close to statistically independent and provides an unsupervised method for blind source separation. We introduce contextual ICA in the context of hyperspectral data analysis and apply the method to mineral data from synthetically mixed minerals and real image signatures. Keywords: hyperspectral, ICA, spectral unmixing, Cuprite
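The unmixing-as-blind-source-separation framing can be sketched on synthetic data. This is not the paper's contextual ICA: it uses plain FastICA, invented "mineral" spectra, and sparse random abundance maps purely for illustration.

```python
# Sketch: spectral unmixing cast as blind source separation. Synthetic
# abundance maps (sparse, hence non-Gaussian) are mixed through made-up
# endmember spectra; ICA recovers the abundances without knowing them.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n_pixels, n_bands, n_minerals = 2000, 50, 3

abund = rng.exponential(size=(n_pixels, n_minerals))  # sparse abundances
spectra = rng.uniform(size=(n_minerals, n_bands))     # endmember spectra
X = abund @ spectra                                    # flattened hyperspectral cube

est = FastICA(n_components=n_minerals, random_state=0).fit_transform(X)

# Each estimated component should correlate strongly with one true
# abundance map, up to sign and permutation.
C = np.abs(np.corrcoef(est.T, abund.T)[:n_minerals, n_minerals:])
print(np.round(C.max(axis=1), 2))
```

Because real abundances are non-negative and sum-constrained, dedicated unmixing methods add those constraints; the sketch only shows the statistical-independence core of the ICA approach.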
Activation Functions, Computational Goals and Learning Rules for Local Processors with Contextual Guidance
 Neural Computation
, 1994
Abstract

Cited by 17 (2 self)
Information about context can enable local processors to discover latent variables that are relevant to the context within which they occur, and it can also guide short-term processing. For example, Becker and Hinton (1992) have shown how context can guide learning, and Hummel and Biederman (1992) have shown how it can guide processing in a large neural net for object recognition. This paper therefore studies the basic capabilities of a local processor with two distinct classes of inputs: receptive field inputs that provide the primary drive, and contextual inputs that modulate their effects. The contextual predictions are used to guide processing without confusing them with receptive field inputs. The processor's transfer function must therefore distinguish these two roles. Given these two classes of input the information in the output can be decomposed into four disjoint components to provide a space of possible goals in which the unsupervised learning of Linsker (1988) and the internal...
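The driving-versus-modulatory distinction in the abstract can be made concrete with a toy transfer function. This is not the activation function from the paper, only an illustration of the required property: context can amplify or attenuate a response, but cannot create one on its own.

```python
# Illustrative sketch (not the paper's activation function): a two-input
# unit where receptive-field drive r gates the output and context c only
# modulates its gain.
import numpy as np

def contextual_activation(r, c, gain=1.0):
    # Zero drive gives zero output regardless of context; context with
    # the same sign as the drive boosts the response, opposite sign damps it.
    return r * (1.0 + gain * np.tanh(r * c))

print(contextual_activation(0.0, 5.0))   # no drive -> no output
print(contextual_activation(1.0, 1.0))   # congruent context amplifies
print(contextual_activation(1.0, -1.0))  # incongruent context attenuates
```

Any transfer function with this asymmetry keeps the two input classes in distinct roles, which is the precondition for the information-decomposition goals the abstract goes on to describe.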
Contextually Guided Unsupervised Learning Using Local Multivariate Binary Processors
, 1996
Abstract

Cited by 16 (2 self)
We consider the role of contextual guidance in learning and processing within multi-stream neural networks. Earlier work (Kay & Phillips, 1994, 1996; Phillips et al., 1995) showed how the goals of feature discovery and associative learning could be fused within a single objective, and made precise using information theory, in such a way that local binary processors could extract a single feature that is coherent across streams. In this paper we consider multi-unit local processors with multivariate binary outputs that enable a greater number of coherent features to be extracted. Using the Ising model, we define a class of information-theoretic objective functions and also local approximations, and derive the learning rules in both cases. These rules have similarities to, and differences from, the celebrated BCM rule. Local and global versions of Infomax appear as by-products of the general approach, as well as multivariate versions of Coherent Infomax. Focussing on the more biologicall...