Sparse coding with an overcomplete basis set: a strategy employed by V1
- Vision Research
, 1997
Cited by 427 (6 self)
The spatial receptive fields of simple cells in mammalian striate cortex have been reasonably well described physiologically and can be characterized as being localized, oriented, and bandpass, comparable with the basis functions of wavelet transforms. Previously, we have shown that these receptive field properties may be accounted for in terms of a strategy for producing a sparse distribution of output activity in response to natural images. Here, in addition to describing this work in a more expansive fashion, we examine the neurobiological implications of sparse coding. Of particular interest is the case when the code is overcomplete--i.e., when the number of code elements is greater than the effective dimensionality of the input space. Because the basis functions are non-orthogonal and not linearly independent of each other, sparsifying the code will recruit only those basis functions necessary for representing a given input, and so the input-output function will deviate from being purely linear. These deviations from linearity provide a potential explanation for the weak forms of non-linearity observed in the response properties of cortical simple cells, and they further make predictions about the expected interactions among units in
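As a rough illustration of why a sparsified overcomplete code is not a linear function of its input, the sketch below infers coefficients for a random overcomplete basis by iterative soft-thresholding (ISTA) with an L1 penalty. This is a stand-in chosen for brevity, not the inference procedure or sparseness cost of the paper; the basis `Phi`, the penalty weight `lam`, the sizes, and all function names are assumptions made for the example.

```python
import numpy as np

def soft_threshold(v, t):
    # Shrink every entry toward zero by t; entries smaller than t become exactly 0.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(x, Phi, a, lam):
    # Reconstruction error plus an L1 sparseness penalty (assumed cost, not the paper's).
    return 0.5 * np.sum((x - Phi @ a) ** 2) + lam * np.sum(np.abs(a))

def ista(x, Phi, lam=0.1, n_iter=200):
    # Step size 1/L, where L is the largest eigenvalue of Phi^T Phi.
    L = np.linalg.norm(Phi, 2) ** 2
    a = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a

rng = np.random.default_rng(0)
n_pixels, n_basis = 16, 32            # overcomplete: more basis functions than inputs
Phi = rng.standard_normal((n_pixels, n_basis))
Phi /= np.linalg.norm(Phi, axis=0)    # unit-norm basis functions (columns)

a_true = np.zeros(n_basis)
a_true[[3, 17]] = [1.5, -2.0]         # hypothetical sparse cause of the input
x = Phi @ a_true

a_hat = ista(x, Phi)
print("nonzero coefficients:", np.count_nonzero(a_hat))
```

Because the thresholding step silences basis functions that are not needed for a given input, the mapping from `x` to `a_hat` is nonlinear even though the generative model `Phi @ a` is linear.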
The "Independent Components" of Natural Scenes are Edge Filters
, 1997
Cited by 381 (24 self)
It has previously been suggested that neurons with line and edge selectivities found in primary visual cortex of cats and monkeys form a sparse, distributed representation of natural scenes, and it has been reasoned that such responses should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent visual features. We show here that a new unsupervised learning algorithm based on information maximization, a nonlinear "infomax" network, when applied to an ensemble of natural scenes produces sets of visual filters that are localized and oriented. Some of these filters are Gabor-like and resemble those produced by the sparseness-maximization network. In addition, the outputs of these filters are as independent as possible, since this infomax network performs Independent Components Analysis or ICA, for sparse (super-gaussian) component distributions. We compare the resulting ICA filters and their associated basis functions, with other decorrelating filters produced by Principal Components Analysis (PCA) and zero-phase whitening filters (ZCA). The ICA filters have more sparsely distributed (kurtotic) outputs on natural scenes. They also resemble the receptive fields of simple cells in visual cortex, which suggests that these neurons form a natural, information-theoretic
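The two decorrelating baselines named in this abstract, PCA and zero-phase (ZCA) whitening, can be sketched in a few lines. The example below is only an illustration on synthetic correlated Gaussian data, not natural scenes, and it assumes a full-rank covariance matrix; the function name `whitening_filters` and the mixing matrix `A` are hypothetical.

```python
import numpy as np

def whitening_filters(X):
    # X has one sample per row. Returns centered data and both filter matrices.
    Xc = X - X.mean(axis=0)
    C = np.cov(Xc, rowvar=False)
    eigvals, E = np.linalg.eigh(C)               # C = E diag(eigvals) E^T
    D_inv_sqrt = np.diag(1.0 / np.sqrt(eigvals)) # assumes no zero eigenvalues
    W_pca = D_inv_sqrt @ E.T                     # rotate to principal axes, then rescale
    W_zca = E @ D_inv_sqrt @ E.T                 # rotate back: symmetric, zero-phase
    return Xc, W_pca, W_zca

rng = np.random.default_rng(0)
A = np.eye(5) + 0.3 * rng.standard_normal((5, 5))   # hypothetical mixing matrix
X = rng.standard_normal((2000, 5)) @ A.T            # correlated synthetic data

Xc, W_pca, W_zca = whitening_filters(X)
cov_pca = np.cov(Xc @ W_pca.T, rowvar=False)
cov_zca = np.cov(Xc @ W_zca.T, rowvar=False)
print(np.round(cov_zca, 3))
```

Both filter sets remove second-order correlations and differ only by a rotation. ICA can be viewed as choosing a further rotation using higher-order statistics, which this sketch does not attempt.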
Neuronal Architectures for Pattern-theoretic Problems
- Large-Scale Theories of the Cortex
, 1994
Cited by 65 (1 self)
this paper is the proposition that the computational analysis of vision -- and speech, tactile sensing, motor control, etc. -- (the theory of the computation as Marr called it (Marr, 82)) is reaching a point where it can provide a clearer and deeper description of the essential tasks of vision as well as a wide range of other cognitive tasks. For instance, the development of algorithms for character recognition or for face recognition or for road tracking from a moving vehicle (three problems which have been much studied on account of their potential applications) forces the researcher to deal with noisy, complex real world data. In doing this, one's initial ideas about what parts of the problem are difficult, what parts are simple, may turn out to be quite wrong. Quite often, a step which one thinks of as a simple pre-processing clean up operation turns out to be very difficult and pinpoints for you a new class of problems which had been ignored. Introspection often turns out to be a very poor guide to the complexity of a problem. The reason for this, we believe, is our subjective impression of perceiving instantaneously and effortlessly the significance of sensory patterns, e.g. the word being spoken or which face is being seen. Many psychological experiments however have shown that what we perceive is not the true sensory signal, but a rational reconstruction of what the signal should be. This means that the messy ambiguous raw signal never makes it to our consciousness but gets overlaid with a clearly and precisely patterned version which could never have been computed without the extensive use of memories, expectations and logic. Only when you attempt to duplicate such a skill by computer do you discover all the hidden complexity in the computation. We believe ...
Learning the Higher-Order Structure of a Natural Sound
, 1996
Cited by 57 (7 self)
Unsupervised learning algorithms paying attention only to second-order statistics ignore the phase structure (higher-order statistics) of signals, which contains all the informative temporal and spatial coincidences which we think of as `features'. Here we discuss how an Independent Component Analysis (ICA) algorithm may be used to elucidate the higher-order structure of natural signals, yielding their independent basis functions. This is illustrated with the ICA transform of the sound of a fingernail tapping musically on a tooth. The resulting independent basis functions look like the sounds themselves, having the same temporal envelopes and the same musical pitches. Thus they reflect both the phase and frequency information inherent in the data.
Wavelets, vision and the statistics of natural scenes
- 71 Academy of Science, Engineering and Technology 26 2007
, 1999
Cited by 24 (0 self)
The processing of spatial information by the visual system shows a number of similarities to the wavelet transforms that have become popular in applied mathematics. Over the last decade, a range of studies has focused on the question of ‘why’ the visual system would evolve this strategy of coding spatial information. One such approach has focused on the relationship between the visual code and the statistics of natural scenes under the assumption that the visual system has evolved this strategy as a means of optimizing the representation of its visual environment. This paper reviews some of this literature and looks at some of the statistical properties of natural scenes that allow this code to be efficient. It is argued that such wavelet codes are efficient because they increase the independence of the vectors’ outputs (i.e. they increase the independence of the responses of the visual neurons) by finding the sparse structure available in the input. Studies with neural networks that attempt to maximize the ‘sparsity’ of the representation have been shown to produce vectors (neural receptive fields) that have many of the properties of a wavelet representation. It is argued that the visual environment has the appropriate sparse structure to make this sparse output possible. It is argued that these sparse/independent representations make it computationally easier to detect and represent the higher-order structure present in complex environmental data.
Unsupervised Neural Network Learning Procedures . . .
, 1996
Cited by 21 (1 self)
In this article, we review unsupervised neural network learning procedures which can be applied to the task of preprocessing raw data to extract useful features for subsequent classification. The learning algorithms reviewed here are grouped into three sections: information-preserving methods, density estimation methods, and feature extraction methods. Each of these major sections concludes with a discussion of successful applications of the methods to real-world problems.
Unsupervised Discrimination of Clustered Data via Optimization of Binary Information Gain
- Advances in Neural Information Processing Systems
, 1993
Cited by 20 (8 self)
We present the information-theoretic derivation of a learning algorithm that clusters unlabelled data with linear discriminants. In contrast to methods that try to preserve information about the input patterns, we maximize the information gained from observing the output of robust binary discriminators implemented with sigmoid nodes. We derive a local weight adaptation rule via gradient ascent in this objective, demonstrate its dynamics on some simple data sets, relate our approach to previous work and suggest directions in which it may be extended.
Receptive Fields and Maps in the Visual Cortex: Models of Ocular Dominance and Orientation Columns
, 1996
Cited by 20 (2 self)
The formation of ocular dominance and orientation columns in the mammalian visual cortex is briefly reviewed. Correlation-based models for their development are then discussed, beginning with the models of Von der Malsburg. For the case of semi-linear models, model behavior is well understood: correlations determine receptive field structure, intracortical interactions determine projective field structure, and the "knitting together" of the two determines the cortical map. This provides a basis for simple but powerful models of ocular dominance and orientation column formation: ocular dominance columns form through a correlation-based competition between left-eye and right-eye inputs, while orientation columns can form through a competition between ON-center and OFF-center inputs. These models account well for receptive field structure, but are not completely adequate to account for the details of cortical map structure. Alternative approaches to map structure, including the...
Searching for Filters With "Interesting" Output Distributions: An Uninteresting Direction to Explore?
- Network
, 1996
Cited by 20 (1 self)
It has been proposed that the receptive fields of neurons in V1 are optimised to generate "sparse", Kurtotic, or "interesting" output probability distributions (Barlow & Tolhurst, 1992; Barlow, 1994; Field, 1994; Intrator & Cooper, 1991; Intrator, 1992). We investigate the empirical evidence for this further and argue that filters can produce "interesting" output distributions simply because natural images have variable local intensity variance. If the proposed filters have zero D.C., then the probability distribution of filter outputs (and hence the output Kurtosis) is well predicted simply from these effects of variable local variance. This suggests that finding filters with high output Kurtosis does not necessarily signal interesting image structure. It is then argued that finding filters that maximise output Kurtosis generates filters that are incompatible with observed physiology. In particular the optimal difference-of-Gaussian (DOG) filter should have the smallest possible s...
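The central statistical point, that variable local variance alone can produce a kurtotic output distribution, can be checked on synthetic data. The sketch below is an assumption-laden toy, not the analysis of the paper: it uses no images or filters, only Gaussian samples whose standard deviation is either fixed or drawn from a log-normal distribution (a hypothetical stand-in for variable local image contrast).

```python
import numpy as np

def excess_kurtosis(v):
    # Fourth standardized moment minus 3, so a Gaussian scores about 0.
    v = v - v.mean()
    return np.mean(v ** 4) / np.mean(v ** 2) ** 2 - 3.0

rng = np.random.default_rng(0)
n = 200_000

# Zero-mean "filter outputs" with one global variance.
fixed = rng.standard_normal(n)

# The same Gaussian outputs, but each sample scaled by its own local
# standard deviation (log-normal spread chosen arbitrarily for illustration).
local_sigma = np.exp(0.5 * rng.standard_normal(n))
variable = local_sigma * rng.standard_normal(n)

k_fixed = excess_kurtosis(fixed)
k_variable = excess_kurtosis(variable)
print(f"fixed variance:    {k_fixed:.2f}")
print(f"variable variance: {k_variable:.2f}")
```

The second distribution is heavy-tailed even though every individual sample is conditionally Gaussian, which is the sense in which high kurtosis need not signal structured image content.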
Combining Exploratory Projection Pursuit And Projection Pursuit Regression With Application To Neural Networks
- Neural Computation
, 1992
Cited by 16 (9 self)
We present a novel classification and regression method that combines exploratory projection pursuit (unsupervised training) with projection pursuit regression (supervised training), to yield a new family of cost/complexity penalty terms. Some improved generalization properties are demonstrated on real world problems.
1 Introduction
Parameter estimation becomes difficult in high-dimensional spaces due to the increasing sparseness of the data. Therefore, when a low dimensional representation is embedded in the data, dimensionality reduction methods become useful. One such method, projection pursuit regression (PPR) (Friedman and Stuetzle, 1981), is capable of performing dimensionality reduction by composition, namely, it constructs an approximation to the desired response function using a composition of lower dimensional smooth functions. These functions depend on low dimensional projections through the data. When the dimensionality of the problem is in the thousands, even projection...

