Hierarchical Bayesian Inference in the Visual Cortex, 2002
Cited by 106 (0 self)
In this paper, we propose a Bayesian theory of hierarchical cortical computation based both on (a) the mathematical and computational ideas of computer vision and pattern theory and on (b) recent neurophysiological experimental evidence. We [1, 2] have proposed that Grenander's pattern theory [3] could potentially model the brain as a generative model in such a way that feedback serves to disambiguate and 'explain away' the earlier representation. The Helmholtz machine [4, 5] was an excellent step towards approximating this proposal, with feedback implementing priors. Its development, however, was rather limited, dealing only with binary images. Moreover, its feedback mechanisms were engaged only during the learning of the feedforward connections but not during perceptual inference, though the Gibbs sampling process for inference can potentially be interpreted as top-down feedback disambiguating low-level representations. Rao and Ballard's predictive coding/Kalman filter model [6] did integrate generative feedback in the perceptual inference process, but it was primarily a linear model and thus severely limited in practical utility. The data-driven Markov Chain Monte Carlo approach of Zhu and colleagues [7, 8] might be the most successful recent application of this proposal in solving real and difficult computer vision problems using generative models, though its connection to the visual cortex has not been explored. Here, we bring in a powerful and widely applicable paradigm from artificial intelligence and computer vision to propose some new ideas about the algorithms of visual cortical processing and the nature of representations in the visual cortex. We will review some of our and others' neurophysiological experimental data to lend support to these ideas.
Attending to Visual Motion - CVIU, 2004
Cited by 10 (2 self)
A novel model of attentive visual motion processing is presented. A new feedforward motion-processing pyramid is described whose motivation lies in the neurobiology of primate motion processes. On this structure the Selective Tuning (ST) model for visual attention is implemented and demonstrated, showing how it can localize and label simple motion patterns. There are three main contributions: 1) we present a new feed-forward motion processing hierarchy, the first to include a multi-level decomposition of processing including local spatial derivatives of velocity as a separate layer; 2) we present examples of how ST can operate on this hierarchy to localize and label motion patterns; and, 3) we present a new solution to aspects of the feature binding problem and show it to be sufficient for the task of grouping motion features into coherent object motion. This feature grouping (or binding) is accomplished using a top-down attentional selection mechanism that does not depend on a single location-based saliency representation.
A Feedback Model of Visual Attention - JOURNAL OF COGNITIVE NEUROSCIENCE, 2004
Cited by 9 (4 self)
Feedback connections are a prominent feature of cortical anatomy and are likely to have a significant functional role in neural information processing. We present a neural network model of cortical feedback that successfully simulates neurophysiological data associated with attention. In this domain our model can be considered a more detailed, and biologically plausible, implementation of the biased competition model of attention. However, our model is more general as it can also explain a variety of other top-down processes in vision, such as figure/ground segmentation and contextual cueing. This model thus suggests that a common mechanism, involving cortical feedback pathways, is responsible for a range of phenomena and provides a unified account of currently disparate areas of research.
Decision-Theoretic Saliency: Computational Principles, Biological Plausibility, and Implications for Neurophysiology and Psychophysics, 2009
Cited by 7 (4 self)
A decision-theoretic formulation of visual saliency, first proposed for top-down processing (object recognition) (Gao & Vasconcelos, 2005a), is extended to the problem of bottom-up saliency. Under this formulation, optimality is defined in the minimum probability of error sense, under a constraint of computational parsimony. The saliency of the visual features at a given location of the visual field is defined as the power of those features to discriminate between the stimulus at the location and a null hypothesis. For bottom-up saliency, this is the set of visual features that surround the location under consideration. Discrimination is defined in an information-theoretic sense and the optimal saliency detector derived for a class of stimuli that complies with known statistical properties of natural images. It is shown that under the assumption that saliency is driven by linear filtering, the optimal detector consists of what is usually referred to as the standard architecture of V1: a cascade of linear filtering, divisive normalization, rectification, and spatial pooling. The optimal detector is also shown to replicate the fundamental properties of the psychophysics of saliency: stimulus pop-out, saliency asymmetries for stimulus presence versus absence, disregard of feature conjunctions, and Weber’s law. Finally, it is shown that the optimal saliency architecture can be applied to the solution of generic inference problems. In particular, for the class of stimuli studied, it performs the three fundamental operations of statistical inference: assessment of probabilities, implementation of Bayes decision rule, and feature selection.
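The cascade named in this abstract (linear filtering, divisive normalization, rectification, spatial pooling) can be illustrated with a minimal sketch. This is not the optimal detector derived in the paper: the derivative kernels, the normalization constant `sigma`, the box-shaped pooling window, and the function names `filter2d_same` and `v1_cascade` are all assumptions chosen only to show the order of operations.

```python
import numpy as np

def filter2d_same(image, kernel):
    """Zero-padded 2-D cross-correlation returning an output the size of `image`.
    Assumes odd kernel dimensions."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def v1_cascade(image, kernels, sigma=0.1, pool_size=3):
    """Illustrative cascade: filter -> divisive normalization -> rectify -> pool.
    `sigma` and `pool_size` are hypothetical parameters, not values from the paper."""
    # 1) Linear filtering with each kernel: shape (K, H, W).
    responses = np.stack([filter2d_same(image, k) for k in kernels])
    # 2) Divisive normalization by the summed response magnitude across filters.
    normalized = responses / (sigma + np.abs(responses).sum(axis=0, keepdims=True))
    # 3) Full-wave rectification.
    rectified = np.abs(normalized)
    # 4) Spatial pooling with a box average, then summing over filters.
    box = np.ones((pool_size, pool_size)) / pool_size**2
    pooled = np.stack([filter2d_same(r, box) for r in rectified])
    return pooled.sum(axis=0)

# Hypothetical usage: horizontal and vertical derivative kernels on a step edge.
kernels = [
    np.array([[-1.0, 0.0, 1.0]] * 3),
    np.array([[-1.0, 0.0, 1.0]] * 3).T,
]
image = np.zeros((9, 9))
image[:, 5:] = 1.0
saliency_map = v1_cascade(image, kernels)
```

Because of the normalization step, every value in `saliency_map` lies between 0 and 1, and a blank image yields an all-zero map. Zero padding makes the image border respond like an edge, so a practical version would likely use a different boundary treatment.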
The Role of Early Visual Cortex in Visual Integration: A Neural Model of Recurrent Interaction - EUROPEAN JOURNAL OF NEUROSCIENCE, 2004
Cited by 6 (0 self)
This paper presents a model of the potential functional roles of the early visual cortex in the primate visual system. Our hypothesis is that early visual areas, such as V1, are important for continual interaction among various higher order visual areas during visual processing. The interaction is mediated by recurrent connections between higher order visual areas and V1, manifested in the long-latency context-sensitive activities often observed in neurophysiological experiments, and is responsible for the re-integration of information analysed by the higher visual areas. Specifically, we considered the case of integrating 'what' and 'where' information from the ventral and dorsal streams. We found that such a cortical architecture provides simple solutions and fresh insights into the problems of attentional routing and visual search. The computational viability of this architecture was tested by simulating a large-scale neural dynamical network.
Statistical Correlations Between Two-Dimensional Images and Three-Dimensional Structures in Natural Scenes, 2003
Cited by 5 (1 self)
... this paper we selected a 50-scene subset of our database with spatial resolution of 22.5 ± 2.5 pixels per degree. This removes from our dataset images with very high or low spatial resolutions. Extremely dark images were also discarded. The contents of our images include scans of trees and wooded areas, rocky areas, building exteriors, and sculptures. Twenty-one images were of urban scenes and twenty-nine were of rural scenes. Each image required minutes to scan, hence only stable and stationary scenes were taken. The average size of our images was 1000 × 604 pixels, for a total of 30,177,930 pixels.
Invariance and selectivity in the ventral visual pathway
Cited by 2 (0 self)
Pattern recognition systems that are invariant to shape, pose, lighting and texture are never sufficiently selective; they suffer a high rate of “false alarms”. How are biological vision systems both invariant and selective? Specifically, how are proper arrangements of sub-patterns distinguished from the chance arrangements that defeat selectivity in artificial systems? The answer may lie in the nonlinear dynamics that characterize complex and other invariant cell types: these cells are temporarily more receptive to some inputs than to others (functional connectivity). One consequence is that pairs of such cells with overlapping receptive fields will possess a related property that might be termed functional common input. Functional common input would induce high correlation exactly when there is a match in the sub-patterns appearing in the overlapping receptive fields. These correlations, possibly expressed as a partial and highly local synchrony, would preserve the selectivity otherwise lost to invariance.
Neurophysiological bases of exponential sensory decay and top-down memory retrieval: a model - COMPUTATIONAL NEUROSCIENCE, ORIGINAL RESEARCH ARTICLE, 2009
doi: 10.3389/neuro.10.004.2009
Online Learning for Attention, Recognition, and Tracking by a Single Developmental Framework
It is likely that human-level online learning for vision will require a brain-like developmental model. We present a general purpose model, called the Self-Aware and Self-Effecting (SASE) model, characterized by internal sensation and action. Rooted in the biological genomic equivalence principle, this model is a general-purpose cell-centered in-place learning scheme to handle different levels of development and operation, from the cell level all the way to the brain level. It is unknown how the brain self-organizes its internal wiring without a holistically-aware central controller. How does the brain develop internal object representations? How do such representations enable tightly intertwined attention and recognition in the presence of complex backgrounds? Internally in SASE, local neural learning uses only the co-firing between the pre-synaptic and post-synaptic activities. Such a two-way representation automatically boosts action-relevant components in the sensory inputs (e.g., foreground vs. background) by increasing the chance that only action-related feature detectors win in competition. It enables development in a “skull-closed” fashion. We discuss SASE networks called Where-What networks (WWN) for the open problem of general purpose online attention and recognition with complex backgrounds. In WWN, desired invariance and specificity emerge at each of the what and where motor ends without an internal master map. WWN allows both type-based top-down attention and location-based top-down attention, to attend and recognize individual objects from complex backgrounds (which may include other objects). It is proposed that WWN deals with any real-world foreground objects and any complex backgrounds.

