Results 1–10 of 45
The "Independent Components" of Natural Scenes are Edge Filters
, 1997
Abstract

Cited by 477 (27 self)
It has previously been suggested that neurons with line and edge selectivities found in primary visual cortex of cats and monkeys form a sparse, distributed representation of natural scenes, and it has been reasoned that such responses should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent visual features. We show here that a new unsupervised learning algorithm based on information maximization, a nonlinear "infomax" network, when applied to an ensemble of natural scenes produces sets of visual filters that are localized and oriented. Some of these filters are Gabor-like and resemble those produced by the sparseness-maximization network. In addition, the outputs of these filters are as independent as possible, since this infomax network performs Independent Component Analysis, or ICA, for sparse (super-gaussian) component distributions. We compare the resulting ICA filters, and their associated basis functions, with other decorrelating filters produced by Principal Components Analysis (PCA) and zero-phase whitening filters (ZCA). The ICA filters have more sparsely distributed (kurtotic) outputs on natural scenes. They also resemble the receptive fields of simple cells in visual cortex, which suggests that these neurons form a natural, information-theoretic ...
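The infomax learning described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it uses the natural-gradient form of the infomax update with a logistic nonlinearity on a synthetic two-source mixture (all sizes, rates, and variable names here are hypothetical).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic toy problem (not natural scenes): two super-gaussian
# (Laplacian) sources, mixed by an unknown matrix A.
n, T = 2, 5000
S = rng.laplace(size=(n, T))
A = rng.normal(size=(n, n))
X = A @ S

# Pre-whiten the mixtures so learning is well conditioned.
Xc = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(Xc @ Xc.T / T)
wht = E @ np.diag(d ** -0.5) @ E.T
Z = wht @ Xc

# Infomax with logistic nonlinearity y = g(Wx); natural-gradient form:
#   dW ∝ (I + (1 - 2y) u^T) W,  with u = Wx
W = np.eye(n)
lr = 0.1
for _ in range(300):
    U = W @ Z
    Y = 1.0 / (1.0 + np.exp(-U))
    W += lr * (np.eye(n) + (1.0 - 2.0 * Y) @ U.T / T) @ W

# If separation worked, W @ wht @ A is close to a scaled permutation:
# each row is dominated by a single entry.
P = W @ wht @ A
dominance = np.max(np.abs(P), axis=1) / np.sum(np.abs(P), axis=1)
```

The whitening step is a convenience for the sketch; the dominance ratio is just a quick separation diagnostic.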
Non-Linear Neurons in the Low Noise Limit: A Factorial Code Maximizes Information Transfer
, 1994
Abstract

Cited by 141 (18 self)
We investigate the consequences of maximizing information transfer in a simple neural network (one input layer, one output layer), focussing on the case of non-linear transfer functions. We assume that both receptive fields (synaptic efficacies) and transfer functions can be adapted to the environment. The main result is that, for bounded and invertible transfer functions, in the case of a vanishing additive output noise and no input noise, maximization of information (Linsker's infomax principle) leads to a factorial code, hence to the same solution as required by the redundancy reduction principle of Barlow. We show also that this result is valid for linear, more generally unbounded, transfer functions, provided optimization is performed under an additive constraint, that is, a constraint which can be written as a sum of terms, each one specific to one output neuron. Finally, we study the effect of a non-zero input noise. We find that, at first order in the input noise, assumed to be small ...
Vector Reconstruction from Firing Rates
, 1994
Abstract

Cited by 112 (7 self)
In a number of systems, including wind detection in the cricket, visual motion perception and coding of arm movement direction in the monkey, and place cell response to position in the rat hippocampus, firing rates in a population of tuned neurons are correlated with a vector quantity. We examine and compare several methods that allow the coded vector to be reconstructed from measured firing rates. In cases where the neuronal tuning curves resemble cosines, linear reconstruction methods work as well as more complex statistical methods requiring more detailed information about the responses of the coding neurons. We present a new linear method, the optimal linear estimator (OLE), that on average provides the best possible linear reconstruction. This method is compared with the more familiar vector method and shown to produce more accurate reconstructions using far fewer recorded neurons. Introduction: To determine how information is represented by nervous systems, we need to understand ...
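The OLE described in this abstract has a simple closed form: the decoding vectors are D = Q⁻¹L, where Q holds the firing-rate correlations and L the correlations between rates and the coded vector. A small synthetic sketch, assuming a hypothetical cosine-tuned population (not the paper's data or parameters):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population of rectified cosine-tuned neurons coding
# a 2-D direction vector.
n_neurons, n_trials = 20, 2000
pref = rng.uniform(0, 2 * np.pi, n_neurons)          # preferred directions
C = np.stack([np.cos(pref), np.sin(pref)], axis=1)   # n_neurons x 2

theta = rng.uniform(0, 2 * np.pi, n_trials)
V = np.stack([np.cos(theta), np.sin(theta)], axis=1) # coded unit vectors
rates = np.clip(V @ C.T + 0.1 * rng.normal(size=(n_trials, n_neurons)),
                0, None)                              # nonnegative rates

# Optimal linear estimator: v_hat = r @ D with D = Q^{-1} L, where
# Q = <r r^T> (rate correlations) and L = <r v> (rate-vector correlations).
Q = rates.T @ rates / n_trials
L = rates.T @ V / n_trials
D = np.linalg.solve(Q, L)
V_hat = rates @ D

# The simpler population-vector ("vector method") estimate, for comparison:
# sum each neuron's preferred direction weighted by its rate.
V_pv = rates @ C

# Direction accuracy of the OLE reconstruction.
cos_sim = np.sum(V_hat * V, axis=1) / (
    np.linalg.norm(V_hat, axis=1) * np.linalg.norm(V, axis=1))
```

The `solve` call avoids forming Q⁻¹ explicitly; with cosine-like tuning the OLE and vector-method directions are similar, but the OLE also gets the magnitude right by construction.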
Biologically Plausible Error-driven Learning using Local Activation Differences: The Generalized Recirculation Algorithm
 NEURAL COMPUTATION
, 1996
Abstract

Cited by 94 (10 self)
The error backpropagation learning algorithm (BP) is generally considered biologically implausible because it does not use locally available, activation-based variables. A version of BP that can be computed locally using bidirectional activation recirculation (Hinton & McClelland, 1988) instead of backpropagated error derivatives is more biologically plausible. This paper presents a generalized version of the recirculation algorithm (GeneRec), which overcomes several limitations of the earlier algorithm by using a generic recurrent network with sigmoidal units that can learn arbitrary input/output mappings. However, the contrastive Hebbian learning algorithm (CHL, a.k.a. DBM or mean field learning) also uses local variables to perform error-driven learning in a sigmoidal recurrent network. CHL was derived in a stochastic framework (the Boltzmann machine), but has been extended to the deterministic case in various ways, all of which rely on problematic approximations and assumptions ...
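The two-phase idea behind GeneRec can be illustrated on a toy problem. This is a deliberately minimal sketch, not the paper's networks or tasks: a hypothetical 2-3-1 bidirectional network settles once with outputs free (minus phase) and once with outputs clamped to the target (plus phase), and each weight changes by a local product of pre-synaptic activity and the plus/minus activation difference.

```python
import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy task (hypothetical): learn AND with a 2-3-1 network.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [0.], [0.], [1.]])

n_in, n_hid, n_out = 2, 3, 1
W = 0.5 * rng.normal(size=(n_in, n_hid))   # input -> hidden
V = 0.5 * rng.normal(size=(n_hid, n_out))  # hidden <-> output (symmetric)

def settle(x, out_clamp=None, steps=20):
    """Let the bidirectional network settle; optionally clamp the output."""
    h = np.zeros(n_hid)
    o = np.zeros(n_out) if out_clamp is None else out_clamp
    for _ in range(steps):
        h = sigmoid(x @ W + o @ V.T)       # bottom-up plus top-down input
        if out_clamp is None:
            o = sigmoid(h @ V)
    return h, o

def epoch_error():
    return np.mean([(settle(x)[1] - t) ** 2 for x, t in zip(X, T)])

lr = 0.2
err0 = epoch_error()
for _ in range(500):
    for x, t in zip(X, T):
        h_m, o_m = settle(x)               # minus phase: outputs free
        h_p, _ = settle(x, out_clamp=t)    # plus phase: outputs clamped
        # GeneRec updates: local pre-synaptic activity times the
        # plus/minus difference in post-synaptic activity.
        W += lr * np.outer(x, h_p - h_m)
        V += lr * np.outer(h_m, t - o_m)
err1 = epoch_error()
```

Note how no error derivative is transported anywhere: each synapse sees only the activations on its two sides in the two phases.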
A Unifying Informationtheoretic Framework for Independent Component Analysis
, 1999
Abstract

Cited by 82 (8 self)
We show that different theories recently proposed for Independent Component Analysis (ICA) lead to the same iterative learning algorithm for blind separation of mixed independent sources. We review those theories and suggest that information theory can be used to unify several lines of research. Pearlmutter and Parra (1996) and Cardoso (1997) showed that the infomax approach of Bell and Sejnowski (1995) and the maximum likelihood estimation approach are equivalent. We show that negentropy maximization also has equivalent properties and therefore all three approaches yield the same learning rule for a fixed nonlinearity. Girolami and Fyfe (1997a) have shown that the nonlinear Principal Component Analysis (PCA) algorithm of Karhunen and Joutsensalo (1994) and Oja (1997) can also be viewed from information-theoretic principles since it minimizes the sum of squares of the fourth-order marginal cumulants and therefore approximately minimizes the mutual information (Comon, 1994). Lambert (19...
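The infomax/maximum-likelihood equivalence mentioned above can be checked numerically: for a logistic nonlinearity g, the analytic infomax gradient equals the gradient of the log-likelihood when each source is modeled with the density p(u) = g'(u). A small sketch on synthetic data (hypothetical sizes, nothing from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
n, T = 3, 40
X = rng.normal(size=(n, T))
W = 0.5 * rng.normal(size=(n, n)) + np.eye(n)

def loglik(Wm):
    """Average log-likelihood of x = A s with source density p(u) = g'(u),
    the derivative of the logistic function: p(u) = g(u)(1 - g(u))."""
    U = Wm @ X
    G = 1.0 / (1.0 + np.exp(-U))
    return (np.log(abs(np.linalg.det(Wm)))
            + np.mean(np.sum(np.log(G * (1.0 - G)), axis=0)))

# Analytic infomax gradient (Bell & Sejnowski 1995):
#   dH/dW = W^{-T} + (1 - 2y) x^T / T
U = W @ X
Y = 1.0 / (1.0 + np.exp(-U))
g_infomax = np.linalg.inv(W).T + (1.0 - 2.0 * Y) @ X.T / T

# Numerical gradient of the ML objective, entry by entry (central differences).
g_ml = np.zeros_like(W)
eps = 1e-6
for i in range(n):
    for j in range(n):
        Wp, Wm_ = W.copy(), W.copy()
        Wp[i, j] += eps
        Wm_[i, j] -= eps
        g_ml[i, j] = (loglik(Wp) - loglik(Wm_)) / (2 * eps)
```

The two gradients agree to numerical precision, which is exactly the equivalence the abstract attributes to Pearlmutter and Parra (1996) and Cardoso (1997).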
Functionally independent components of the late positive event-related potential during visual spatial attention
 J. NEUROSCI
, 1999
Abstract

Cited by 53 (19 self)
Human event-related potentials (ERPs) were recorded from 10 subjects presented with visual target and non-target stimuli at five screen locations and responding to targets presented at one of the locations. The late positive response complexes of 25–75 ERP average waveforms from the two task conditions were simultaneously analyzed with Independent Component Analysis, a new computational method for blindly separating linearly mixed signals. Three spatially fixed, temporally independent, behaviorally relevant, and physiologically plausible components were identified without reference to peaks in single-channel waveforms. A novel frontoparietal component (P3f) began at ~140 msec and peaked, in faster responders, at the onset of the motor command. The scalp distribution of P3f appeared consistent with brain regions activated during spatial orienting in functional imaging experiments. A longer-latency ...
Visual Adaptation as Optimal Information Transmission
 Vision Research
, 1999
Abstract

Cited by 30 (1 self)
We propose that visual adaptation in orientation, spatial frequency, and motion can be understood from the perspective of optimal information transmission. The essence of the proposal is that neural response properties at the system level should be adjusted to the changing statistics of the input so as to maximize information transmission. We show that this principle accounts for several well-documented psychophysical phenomena, including the tilt aftereffect, change in contrast sensitivity, and post-adaptation changes in orientation discrimination. Adaptation can also be considered on a longer time scale, in the context of tailoring response properties to natural scene statistics. From the anisotropic distribution of power in natural scenes, the proposal also predicts differences in the contrast sensitivity function across spatial frequency and orientation, including the oblique effect. © 1999 Elsevier Science Ltd. All rights reserved. Keywords: Adaptation; Aftereffects; Signal-to-noise ratio; Sensitivity; Natural scenes
Decoding Neuronal Firing And Modeling Neural Networks
 Quart. Rev. Biophys
, 1994
Abstract

Cited by 25 (4 self)
Introduction: Biological neural networks are large systems of complex elements interacting through a complex array of connections. Individual neurons express a large number of active conductances (Connors et al., 1982; Adams & Gavin, 1986; Llinás, 1988; McCormick, 1990; Hille, 1992) and exhibit a wide variety of dynamic behaviors on time scales ranging from milliseconds to many minutes (Llinás, 1988; Harris-Warrick & Marder, 1991; Churchland & Sejnowski, 1992; Turrigiano et al., 1994). Neurons in cortical circuits are typically coupled to thousands of other neurons (Stevens, 1989), and very little is known about the strengths of these synapses (although see Rosenmund et al., 1993; Hessler et al., 1993; Smetters & Nelson, 1993). The complex firing patterns of large neuronal populations are difficult to describe, let alone understand. There is little point in accurately modeling each membrane potential in a large neural ...
A Geometric Algorithm for Overcomplete Linear ICA
 NEUROCOMPUTING
, 2003
Abstract

Cited by 23 (11 self)
Geometric algorithms for linear quadratic independent component analysis (ICA) have recently received some attention due to their pictorial description and their relative ease of implementation. The geometric approach to ICA was first proposed by Puntonet and Prieto [1, 2] in order to separate linear mixtures. We generalize these algorithms to overcomplete cases with more sources than sensors. With geometric ICA we get an efficient method for the matrix-recovery step in the framework of a two-step approach to the source separation problem. The second step, source-recovery, uses a maximum-likelihood approach. There we prove that the shortest-path algorithm as proposed by Bofill and Zibulevsky in [3] indeed solves the maximum-likelihood conditions.
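In the two-sensor case, the maximum-likelihood source-recovery step under a Laplacian prior is the minimum-l1 solution of A s = x, and because the optimum of that linear program is a basic solution with at most two nonzeros, enumerating column pairs computes it exactly; this coincides with the shortest-path decomposition of Bofill and Zibulevsky. A toy sketch (hypothetical 2×4 mixing matrix, not from the paper):

```python
import numpy as np
from itertools import combinations

# Hypothetical overcomplete setup: 2 sensors, 4 sources, known mixing matrix
# whose unit-norm columns point in four directions.
angles = np.array([0.1, 0.9, 1.7, 2.5])
A = np.stack([np.cos(angles), np.sin(angles)])   # 2 x 4

def l1_recover(x, A):
    """Minimum-l1 solution of A s = x for a 2-row A.
    The LP optimum uses at most two columns, so enumerating column
    pairs is exact here (equivalent to the shortest-path decomposition
    for two sensors)."""
    best_s, best_norm = None, np.inf
    for i, j in combinations(range(A.shape[1]), 2):
        B = A[:, [i, j]]
        if abs(np.linalg.det(B)) < 1e-12:
            continue                              # skip parallel columns
        c = np.linalg.solve(B, x)
        if abs(c).sum() < best_norm:
            best_norm = abs(c).sum()
            best_s = np.zeros(A.shape[1])
            best_s[[i, j]] = c
    return best_s

# A sparse source vector whose active directions bracket the mixture
# is recovered exactly from its 2-D observation.
s_true = np.array([0.0, 1.3, 0.4, 0.0])
x = A @ s_true
s_hat = l1_recover(x, A)
```

For larger numbers of sensors the pair enumeration generalizes to basic solutions of size equal to the sensor count, or one solves the l1 program directly.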
Searching for Filters With "Interesting" Output Distributions: An Uninteresting Direction to Explore?
 Network
, 1996
Abstract

Cited by 23 (1 self)
It has been proposed that the receptive fields of neurons in V1 are optimised to generate "sparse", kurtotic, or "interesting" output probability distributions (Barlow & Tolhurst, 1992; Barlow, 1994; Field, 1994; Intrator & Cooper, 1991; Intrator, 1992). We investigate the empirical evidence for this further and argue that filters can produce "interesting" output distributions simply because natural images have variable local intensity variance. If the proposed filters have zero D.C., then the probability distribution of filter outputs (and hence the output kurtosis) is well predicted simply from these effects of variable local variance. This suggests that finding filters with high output kurtosis does not necessarily signal interesting image structure. It is then argued that finding filters that maximise output kurtosis generates filters that are incompatible with observed physiology. In particular, the optimal difference-of-Gaussian (DOG) filter should have the smallest possible s...
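The variable-local-variance argument above is easy to demonstrate: Gaussian noise whose variance fluctuates from patch to patch (a scale mixture of Gaussians) already has positive excess kurtosis, with no oriented structure for a filter to find. A small sketch, with hypothetical variance ranges:

```python
import numpy as np

rng = np.random.default_rng(5)

def excess_kurtosis(v):
    """Fourth moment over squared variance, minus 3 (zero for a Gaussian)."""
    v = v - v.mean()
    return np.mean(v ** 4) / np.mean(v ** 2) ** 2 - 3.0

# Gaussian samples with a single fixed variance: excess kurtosis near 0.
flat = rng.normal(size=100_000)

# Same Gaussian noise, but the standard deviation varies from "patch"
# to "patch" (here uniform in [0.2, 2.0]): the pooled distribution is
# a scale mixture of Gaussians, and its kurtosis is strictly positive.
scales = rng.uniform(0.2, 2.0, size=1000)
variable = (scales[:, None] * rng.normal(size=(1000, 100))).ravel()

k_flat = excess_kurtosis(flat)
k_var = excess_kurtosis(variable)
```

So a high-kurtosis filter output is consistent with nothing more than variance fluctuations, which is the paper's cautionary point.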