Results 1–10 of 41
Sparse coding with an overcomplete basis set: a strategy employed by V1
Vision Research, 1997
Abstract

Cited by 591 (7 self)
The spatial receptive fields of simple cells in mammalian striate cortex have been reasonably well described physiologically and can be characterized as being localized, oriented, and band-pass, comparable with the basis functions of wavelet transforms. Previously, we have shown that these receptive field properties may be accounted for in terms of a strategy for producing a sparse distribution of output activity in response to natural images. Here, in addition to describing this work in a more expansive fashion, we examine the neurobiological implications of sparse coding. Of particular interest is the case when the code is overcomplete, i.e., when the number of code elements is greater than the effective dimensionality of the input space. Because the basis functions are nonorthogonal and not linearly independent of each other, sparsifying the code will recruit only those basis functions necessary for representing a given input, and so the input-output function will deviate from being purely linear. These deviations from linearity provide a potential explanation for the weak forms of nonlinearity observed in the response properties of cortical simple cells, and they further make predictions about the expected interactions among units in ...
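The sparse coding scheme this abstract describes can be illustrated with a small numerical sketch. The dictionary size, the random stand-in for an image patch, and the ISTA solver below are illustrative assumptions of mine, not the authors' actual model; the sketch only shows how sparsifying an overcomplete code recruits a subset of the basis functions.

```python
import numpy as np

# Toy sparse inference: given an overcomplete dictionary Phi, find a sparse
# code a minimizing  ||x - Phi a||^2 + lam * ||a||_1  via ISTA (iterative
# soft-thresholding), a standard sparse solver used here for illustration.

rng = np.random.default_rng(0)
n_input, n_basis = 16, 32            # overcomplete: more basis functions than dims
Phi = rng.standard_normal((n_input, n_basis))
Phi /= np.linalg.norm(Phi, axis=0)   # unit-norm basis functions

x = rng.standard_normal(n_input)     # stand-in for an image patch
lam = 0.5                            # sparsity penalty (arbitrary toy value)
step = 1.0 / np.linalg.norm(Phi, 2) ** 2

a = np.zeros(n_basis)
for _ in range(200):
    grad = Phi.T @ (Phi @ a - x)                              # gradient of fit term
    a = a - step * grad
    a = np.sign(a) * np.maximum(np.abs(a) - step * lam, 0.0)  # soft threshold

print("active coefficients:", np.count_nonzero(np.abs(a) > 1e-6), "of", n_basis)
```

Because the basis is overcomplete and nonorthogonal, the penalty leaves most coefficients at exactly zero, recruiting only the basis functions needed for this input; that selective recruitment is the deviation from linearity the abstract refers to.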
The "Independent Components" of Natural Scenes are Edge Filters
1997
Abstract

Cited by 477 (27 self)
It has previously been suggested that neurons with line and edge selectivities found in primary visual cortex of cats and monkeys form a sparse, distributed representation of natural scenes, and it has been reasoned that such responses should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent visual features. We show here that a new unsupervised learning algorithm based on information maximization, a nonlinear "infomax" network, when applied to an ensemble of natural scenes produces sets of visual filters that are localized and oriented. Some of these filters are Gabor-like and resemble those produced by the sparseness-maximization network. In addition, the outputs of these filters are as independent as possible, since this infomax network performs Independent Component Analysis, or ICA, for sparse (super-Gaussian) component distributions. We compare the resulting ICA filters, and their associated basis functions, with other decorrelating filters produced by Principal Component Analysis (PCA) and zero-phase whitening filters (ZCA). The ICA filters have more sparsely distributed (kurtotic) outputs on natural scenes. They also resemble the receptive fields of simple cells in visual cortex, which suggests that these neurons form a natural, information-theoretic ...
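A minimal numerical illustration of the link between sparse (super-Gaussian) outputs and independence, under assumptions of mine: two artificial Laplacian sources, a hand-picked mixing matrix, and a brute-force rotation search in place of the paper's infomax network.

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.laplace(size=(2, 5000))          # independent super-Gaussian sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])   # arbitrary mixing matrix
X = A @ S                                # observed mixtures

# ZCA ("zero-phase") whitening: decorrelate and equalize variance
d, E = np.linalg.eigh(np.cov(X))
Z = E @ np.diag(d ** -0.5) @ E.T @ X

def kurt(y):
    # excess kurtosis: zero for a Gaussian, positive for sparse signals
    return np.mean(y ** 4) / np.mean(y ** 2) ** 2 - 3.0

# after whitening, search rotations for the maximally kurtotic output
angles = np.linspace(0.0, np.pi, 360)
best = max(angles, key=lambda t: kurt(np.cos(t) * Z[0] + np.sin(t) * Z[1]))
y = np.cos(best) * Z[0] + np.sin(best) * Z[1]

print("excess kurtosis of a raw mixture:", round(kurt(X[0]), 2))
print("excess kurtosis of best rotation:", round(kurt(y), 2))
```

Whitening alone (as in PCA/ZCA) only decorrelates; the extra rotation that maximizes kurtosis approximately recovers an original Laplacian source, which is the sense in which sparsely distributed outputs and independence coincide for super-Gaussian components.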
Neuronal Architectures for Pattern-theoretic Problems
Large-Scale Theories of the Cortex, 1994
Abstract

Cited by 79 (1 self)
this paper is the proposition that the computational analysis of vision (and of speech, tactile sensing, motor control, etc.), the theory of the computation as Marr called it (Marr, 82), is reaching a point where it can provide a clearer and deeper description of the essential tasks of vision as well as a wide range of other cognitive tasks. For instance, the development of algorithms for character recognition, for face recognition, or for road tracking from a moving vehicle (three problems which have been much studied on account of their potential applications) forces the researcher to deal with noisy, complex real-world data. In doing this, one's initial ideas about which parts of the problem are difficult and which parts are simple may turn out to be quite wrong. Quite often, a step which one thinks of as a simple preprocessing clean-up operation turns out to be very difficult and pinpoints for you a new class of problems which had been ignored. Introspection often turns out to be a very poor guide to the complexity of a problem. The reason for this, we believe, is our subjective impression of perceiving instantaneously and effortlessly the significance of sensory patterns, e.g. the word being spoken or which face is being seen. Many psychological experiments, however, have shown that what we perceive is not the true sensory signal, but a rational reconstruction of what the signal should be. This means that the messy, ambiguous raw signal never makes it to our consciousness but gets overlaid with a clearly and precisely patterned version which could never have been computed without the extensive use of memories, expectations and logic. Only when you attempt to duplicate such a skill by computer do you discover all the hidden complexity in the computation. We believe ...
Learning the Higher-Order Structure of a Natural Sound
1996
Abstract

Cited by 64 (7 self)
Unsupervised learning algorithms paying attention only to second-order statistics ignore the phase structure (higher-order statistics) of signals, which contains all the informative temporal and spatial coincidences which we think of as 'features'. Here we discuss how an Independent Component Analysis (ICA) algorithm may be used to elucidate the higher-order structure of natural signals, yielding their independent basis functions. This is illustrated with the ICA transform of the sound of a fingernail tapping musically on a tooth. The resulting independent basis functions look like the sounds themselves, having the same temporal envelopes and the same musical pitches. Thus they reflect both the phase and frequency information inherent in the data.
Wavelets, vision and the statistics of natural scenes
1999
Abstract

Cited by 31 (0 self)
The processing of spatial information by the visual system shows a number of similarities to the wavelet transforms that have become popular in applied mathematics. Over the last decade, a range of studies has focused on the question of ‘why’ the visual system would evolve this strategy of coding spatial information. One such approach has focused on the relationship between the visual code and the statistics of natural scenes, under the assumption that the visual system has evolved this strategy as a means of optimizing the representation of its visual environment. This paper reviews some of this literature and looks at some of the statistical properties of natural scenes that allow this code to be efficient. It is argued that such wavelet codes are efficient because they increase the independence of the vectors’ outputs (i.e. they increase the independence of the responses of the visual neurons) by finding the sparse structure available in the input. Studies with neural networks that attempt to maximize the ‘sparsity’ of the representation have been shown to produce vectors (neural receptive fields) that have many of the properties of a wavelet representation. It is further argued that the visual environment has the appropriate sparse structure to make this sparse output possible, and that these sparse/independent representations make it computationally easier to detect and represent the higher-order structure present in complex environmental data.
Searching for Filters With "Interesting" Output Distributions: An Uninteresting Direction to Explore?
Network, 1996
Abstract

Cited by 23 (1 self)
It has been proposed that the receptive fields of neurons in V1 are optimised to generate "sparse", kurtotic, or "interesting" output probability distributions (Barlow & Tolhurst, 1992; Barlow, 1994; Field, 1994; Intrator & Cooper, 1991; Intrator, 1992). We investigate the empirical evidence for this further and argue that filters can produce "interesting" output distributions simply because natural images have variable local intensity variance. If the proposed filters have zero D.C., then the probability distribution of filter outputs (and hence the output kurtosis) is well predicted simply from these effects of variable local variance. This suggests that finding filters with high output kurtosis does not necessarily signal interesting image structure. It is then argued that finding filters that maximise output kurtosis generates filters that are incompatible with observed physiology. In particular, the optimal difference-of-Gaussian (DoG) filter should have the smallest possible s...
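The paper's central observation is easy to reproduce with made-up numbers (the two variances below are my own toy choices, not the paper's): a zero-mean Gaussian whose variance changes from "patch" to "patch" is already kurtotic, so high output kurtosis alone need not indicate interesting structure.

```python
import numpy as np

rng = np.random.default_rng(2)

def excess_kurtosis(y):
    # zero for a Gaussian of fixed variance, positive for heavy tails
    return np.mean(y ** 4) / np.mean(y ** 2) ** 2 - 3.0

fixed = rng.normal(0.0, 1.0, 100_000)             # constant local variance
sigmas = rng.choice([0.5, 2.0], size=100_000)     # variance varies by "patch"
mixed = rng.normal(0.0, 1.0, 100_000) * sigmas    # Gaussian scale mixture

print("excess kurtosis, fixed variance:   ", round(excess_kurtosis(fixed), 2))
print("excess kurtosis, variable variance:", round(excess_kurtosis(mixed), 2))
```

The scale mixture is Gaussian within every "patch", yet its pooled output distribution is strongly kurtotic, which is exactly the confound the authors raise against kurtosis as a filter-quality criterion.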
Unsupervised Neural Network Learning Procedures . . .
1996
Abstract

Cited by 23 (1 self)
In this article, we review unsupervised neural network learning procedures which can be applied to the task of preprocessing raw data to extract useful features for subsequent classification. The learning algorithms reviewed here are grouped into three sections: information-preserving methods, density estimation methods, and feature extraction methods. Each of these major sections concludes with a discussion of successful applications of the methods to real-world problems.
Receptive Fields and Maps in the Visual Cortex: Models of Ocular Dominance and Orientation Columns
1996
Abstract

Cited by 22 (2 self)
The formation of ocular dominance and orientation columns in the mammalian visual cortex is briefly reviewed. Correlation-based models for their development are then discussed, beginning with the models of von der Malsburg. For the case of semilinear models, model behavior is well understood: correlations determine receptive field structure, intracortical interactions determine projective field structure, and the "knitting together" of the two determines the cortical map. This provides a basis for simple but powerful models of ocular dominance and orientation column formation: ocular dominance columns form through a correlation-based competition between left-eye and right-eye inputs, while orientation columns can form through a competition between ON-center and OFF-center inputs. These models account well for receptive field structure, but are not completely adequate to account for the details of cortical map structure. Alternative approaches to map structure, including the ...
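For the semilinear case, the claim that correlations determine receptive field structure can be sketched with a toy linear Hebbian unit. The correlation matrix, learning rate, and explicit weight normalization below are my assumptions for illustration, not a specific model from the review.

```python
import numpy as np

# A linear Hebbian unit trained on correlated inputs converges (up to sign)
# to the principal eigenvector of the input correlation matrix: the input
# correlations, not the individual samples, fix the learned receptive field.

rng = np.random.default_rng(4)
C = np.array([[1.0, 0.8], [0.8, 1.0]])   # correlated two-input ensemble
L = np.linalg.cholesky(C)                # to draw samples with covariance C

w = rng.normal(0.0, 0.1, 2)
eta = 0.01
for _ in range(5000):
    x = L @ rng.standard_normal(2)       # input sample with covariance C
    y = w @ x                            # linear unit's output
    w += eta * y * x                     # Hebbian update
    w /= np.linalg.norm(w)               # normalization keeps w bounded

print("learned weights:", np.round(w, 2))  # near (1,1)/sqrt(2), up to sign
```

In the models reviewed, competition terms between left-eye/right-eye or ON-center/OFF-center input streams are layered on top of this basic correlation-driven dynamic, which is what produces ocular dominance and orientation columns.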
Unsupervised Discrimination of Clustered Data via Optimization of Binary Information Gain
Advances in Neural Information Processing Systems, 1993
Abstract

Cited by 21 (9 self)
We present the information-theoretic derivation of a learning algorithm that clusters unlabelled data with linear discriminants. In contrast to methods that try to preserve information about the input patterns, we maximize the information gained from observing the output of robust binary discriminators implemented with sigmoid nodes. We derive a local weight adaptation rule via gradient ascent in this objective, demonstrate its dynamics on some simple data sets, relate our approach to previous work and suggest directions in which it may be extended.
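A sketch of the idea, using my own simplified single-unit variant with an assumed objective rather than the authors' exact rule: gradient ascent on the empirical binary information gain J(w) = H(mean y) - mean H(y) of one sigmoid discriminator drives it to make confident, balanced binary decisions, which splits two unlabelled clusters.

```python
import numpy as np

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(-2.0, 0.5, (100, 2)),
               rng.normal(2.0, 0.5, (100, 2))])   # two unlabelled clusters

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-np.clip(u, -60, 60)))

def dH(p):
    # derivative of the binary entropy H(p) = -p log p - (1-p) log(1-p)
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return np.log((1 - p) / p)

w = rng.normal(0.0, 0.1, 2)
lr = 0.5
for _ in range(300):
    y = sigmoid(X @ w)
    s = y * (1 - y)                       # sigmoid slope dy/du
    # gradient of J(w) = H(mean y) - mean H(y)
    grad = dH(y.mean()) * (s[:, None] * X).mean(axis=0) \
         - ((dH(y) * s)[:, None] * X).mean(axis=0)
    w += lr * grad

labels = sigmoid(X @ w) > 0.5
print("cluster split:", int(labels[:100].sum()), "vs", int(labels[100:].sum()))
```

The first term of J rewards a balanced decision (mean output near 0.5) and the second rewards confident outputs (near 0 or 1); maximizing their difference is what lets the unit discover the cluster boundary without labels.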
Manifold Pursuit: A New Approach to Appearance Based Recognition
International Conference on Pattern Recognition (ICPR), 2002
Abstract

Cited by 19 (2 self)
Manifold Pursuit (MP) extends Principal Component Analysis to be invariant to a desired group of image-plane transformations of an ensemble of unaligned images. We derive a ...