Results 1 - 10 of 37
Color indexing
- International Journal of Computer Vision, 1991
Cited by 1124 (23 self)
Computer vision is embracing a new research focus in which the aim is to develop visual skills for robots that allow them to interact with a dynamic, realistic environment. To achieve this aim, new kinds of vision algorithms need to be developed which run in real time and subserve the robot's goals. Two fundamental goals are determining the identity of an object with a known location, and determining the location of a known object. Color can be successfully used for both tasks. This article demonstrates that color histograms of multicolored objects provide a robust, efficient cue for indexing into a large database of models. It shows that color histograms are stable object representations in the presence of occlusion and over change in view, and that they can differentiate among a large number of objects. For solving the identification problem, it introduces a technique called Histogram Intersection, which matches model and image histograms, and a fast incremental version of Histogram Intersection, which allows real-time indexing into a large database of stored models. For solving the location problem, it introduces an algorithm called Histogram Backprojection, which performs this task efficiently in crowded scenes.
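The matching step named in this abstract can be illustrated with a minimal sketch. It assumes the commonly cited form of Histogram Intersection (the sum of bin-wise minima, normalized by the model histogram's total count); the abstract itself does not give the formula. The function names, the 3-bin histograms, and the model labels are hypothetical.

```python
def histogram_intersection(image_hist, model_hist):
    """Sum of bin-wise minima, divided by the model's total count (assumed form)."""
    if len(image_hist) != len(model_hist):
        raise ValueError("histograms must have the same number of bins")
    model_total = sum(model_hist)
    if model_total == 0:
        raise ValueError("model histogram is empty")
    overlap = sum(min(i, m) for i, m in zip(image_hist, model_hist))
    return overlap / model_total


def best_match(image_hist, models):
    """Return the name of the stored model with the highest intersection score."""
    return max(models, key=lambda name: histogram_intersection(image_hist, models[name]))


# Hypothetical 3-bin color histograms (pixel counts per quantized color).
models = {"mug": [1, 1, 2], "box": [0, 4, 0]}
image = [2, 0, 3]
print(histogram_intersection(image, models["mug"]))  # 0.75
print(best_match(image, models))  # mug
```

Because each bin contributes at most the model's own count, extra background pixels in the image cannot raise the score above 1, which is one plausible reading of why the abstract describes the representation as stable under occlusion and clutter.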
An Active Vision Architecture based on Iconic Representations
- Artificial Intelligence, 1995
Cited by 116 (12 self)
Active vision systems have the capability of continuously interacting with the environment. The rapidly changing environment of such systems means that it is attractive to replace static representations with visual routines that compute information on demand. Such routines place a premium on image data structures that are easily computed and used. The purpose of this paper is to propose a general active vision architecture based on efficiently computable iconic representations. This architecture employs two primary visual routines, one for identifying the visual image near the fovea (object identification), and another for locating a stored prototype on the retina (object location). This design allows complex visual behaviors to be obtained by composing these two routines with different parameters. The iconic representations are comprised of high-dimensional feature vectors obtained from the responses of an ensemble of Gaussian derivative spatial filters at a number of orientations and...
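As a rough illustration of an iconic feature vector built from Gaussian derivative filters at several orientations, the sketch below uses first-order derivatives at a single scale and steers them to each requested angle. The abstract is truncated, so the filter orders, scales, and orientations actually used are not known here; every name and parameter below is an assumption.

```python
import math


def gaussian_derivative_kernels(sigma=1.0, radius=3):
    """Sampled first derivatives of a 2-D Gaussian along x and y (unnormalized)."""
    size = range(-radius, radius + 1)
    gx = [[-x / sigma**2 * math.exp(-(x * x + y * y) / (2 * sigma**2)) for x in size]
          for y in size]
    gy = [[-y / sigma**2 * math.exp(-(x * x + y * y) / (2 * sigma**2)) for x in size]
          for y in size]
    return gx, gy


def oriented_responses(image, row, col, angles, sigma=1.0, radius=3):
    """Feature vector at one pixel: one filter response per orientation.

    Uses correlation (no kernel flip) and the steering identity
    R(theta) = cos(theta) * Rx + sin(theta) * Ry for first derivatives.
    """
    gx, gy = gaussian_derivative_kernels(sigma, radius)
    rx = ry = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            pixel = image[row + dy][col + dx]
            rx += gx[dy + radius][dx + radius] * pixel
            ry += gy[dy + radius][dx + radius] * pixel
    return [math.cos(a) * rx + math.sin(a) * ry for a in angles]


# Hypothetical 7x7 image whose intensity increases from left to right.
ramp = [[float(x) for x in range(7)] for _ in range(7)]
features = oriented_responses(ramp, 3, 3, [0.0, math.pi / 2])
```

On this ramp the horizontal filter responds strongly while the vertical one stays near zero. A fuller version would stack responses from several derivative orders and scales into one vector.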
Conjunction search revisited
- Journal of Experimental Psychology: Human Perception and Performance, 1990
Cited by 86 (1 self)
Search for conjunctions of highly discriminable features can be rapid or even parallel. This article explores three possible accounts based on (a) perceptual segregation, (b) conjunction detectors, and (c) inhibition controlled separately by two or more distractor features. Search rates for conjunctions of color, size, orientation, and direction of motion correlated closely with an independent measure of perceptual segregation. However, they appeared unrelated to the physiology of single-unit responses. Each dimension contributed additively to conjunction search rates, suggesting that each was checked independently of the others. Unknown targets appear to be found only by serial search for each in turn. Searching through 4 sets of distractors was slower than searching through 2. The results suggest a modification of feature integration theory, in which attention is controlled not only by a unitary "window" but also by a form of feature-based inhibition. Objects in the real world vary in a large number of properties, at least some of which appear to be coded by specialized, independent channels or modules in the perceptual
Dynamic Model of Visual Recognition Predicts Neural Response Properties in the Visual Cortex
- Neural Computation, 1995
Cited by 77 (20 self)
In this paper, we describe a hierarchical network model of visual recognition that explains these experimental observations by using a form of the extended Kalman filter as given by the Minimum Description Length (MDL) principle. The model dynamically combines input-driven bottom-up signals with expectation-driven top-down signals to predict current recognition state. Synaptic weights in the model are adapted in a Hebbian manner according to a learning rule also derived from the MDL principle. The resulting prediction/learning scheme can be viewed as implementing a form of the Expectation-Maximization (EM) algorithm. The architecture of the model posits an active computational role for the reciprocal connections between adjoining visual cortical areas in determining neural response properties. In particular, the model demonstrates the possible role of feedback from higher cortical areas in mediating neurophysiological effects due to stimuli from beyond the classical receptive field. Si
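To make concrete the idea of combining a top-down expectation with a bottom-up signal, here is a minimal scalar Kalman update. It is a deliberately simplified, linear, one-dimensional stand-in and not the hierarchical extended Kalman filter of the paper; the function name and the numbers are hypothetical.

```python
def kalman_update(prediction, prediction_var, observation, observation_var):
    """Blend a top-down prediction with a bottom-up observation.

    The gain weights the observation by how uncertain the prediction is
    relative to the observation noise (scalar, linear case only).
    """
    gain = prediction_var / (prediction_var + observation_var)
    estimate = prediction + gain * (observation - prediction)
    estimate_var = (1.0 - gain) * prediction_var
    return estimate, estimate_var


# Equal uncertainty on both sides: the estimate lands halfway between them.
estimate, variance = kalman_update(0.0, 1.0, 2.0, 1.0)
print(estimate, variance)  # 1.0 0.5
```

When the prediction is trusted more (smaller `prediction_var`), the gain shrinks and the estimate stays closer to the expectation. This is the sense in which the residual `observation - prediction` carries the new information.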
Biological constraints on connectionist modelling
- Connectionism in Perspective, 1989
Cited by 56 (5 self)
Many researchers interested in connectionist models accept that such models are "neurally inspired" but do not worry too much about whether their models are biologically realistic. While such a position may be perfectly justifiable, the present paper attempts to illustrate how biological information can be used to constrain connectionist models. Two particular areas are discussed. The first section deals with visual information processing in the primate and human visual system. It is argued that the speed with which visual information is processed imposes major constraints on the architecture and operation of the visual system. In particular, it seems that a great deal of processing must depend on a single bottom-up pass. The second section deals with biological aspects of learning algorithms. It is argued that although there is good evidence for certain coactivation-related synaptic modification schemes, other learning mechanisms, including back-propagation, are not currently supported by experimental data.
A selective impairment of motion perception following lesions of the middle temporal visual area (MT)
- Journal of Neuroscience, 1988
Cited by 56 (1 self)
Physiological experiments indicate that the middle temporal visual area (MT) of primates plays a prominent role in the cortical analysis of visual motion. We investigated the role of MT in visual perception by examining the effect of chemical lesions of MT on psychophysical thresholds. We trained rhesus monkeys on psychophysical tasks that enabled us to assess their sensitivity to motion and to contrast. For motion psychophysics, we employed a dynamic random dot display that permitted us to vary the intensity of a motion signal in the midst of masking motion noise. We measured the threshold intensity for which the monkey could successfully complete a direction discrimination. In the contrast task, we measured the threshold contrast for which the monkeys could successfully discriminate the orientation of stationary gratings. Injections of ibotenic acid into MT caused striking elevations in motion thresholds, but had little or no effect on contrast thresholds. The results indicate that neural activity in MT contributes selectively to the perception of motion.
Extrastriate visual cortex in primates comprises a mosaic of visual areas that can be distinguished on the basis of visual topography, anatomical connections, cortical architecture, and physiological response properties. A growing corpus of data indicates that several of these areas are specialized for the analysis of visual motion information. These areas appear to constitute a motion pathway that originates in striate cortex and terminates in higher cortical areas of the parietal lobe (reviewed by Maunsell and Newsome, 1987). The salient physiological feature of this pathway is an elevated percentage of directionally selective neurons at each level. The pathway begins in layer 4B of striate cortex, which is enriched in such cells relative to other striate laminae (Dow, 1974; Blasdel and Fitzpatrick, 1984; Livingstone and Hubel, 1984; Michael, 1985). Layer 4B projects, in turn, to the middle temporal visual area (MT), in which over 80% of the neurons are directionally selective (e.g., Dubner and Zeki, 1971; Maunsell and Van
Functional Significance Of Long-Term Potentiation For Sequence Learning And Prediction
- Cerebral Cortex, 1994
Cited by 33 (8 self)
Population coding, where neurons with broad and overlapping firing rate tuning curves collectively encode information about a stimulus, is a common feature of sensory systems. We use decoding methods and measured properties of NMDA-mediated LTP induction to study the impact of long-term potentiation of synapses between the neurons of such a coding array. We find that, due to a temporal asymmetry in the induction of NMDA-mediated LTP, firing patterns in a neuronal array that initially represent the current value of a sensory input will, after training, provide an experience-based prediction of that input instead. We compute how this prediction arises from and depends on the training experience. We also show how the encoded prediction can be used to generate learned motor sequences, such as the movement of a limb. This involves a novel form of memory recall that is driven by the motor response so that it automatically generates new information at a rate appropriate for the task being per...
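A small sketch may help fix the idea of population coding and decoding. It assumes Gaussian tuning curves and a simple center-of-mass decoder; the abstract does not say which tuning shape or decoding method the authors use, and the names and parameter values below are hypothetical.

```python
import math


def tuning_responses(stimulus, centers, width=1.0, r_max=1.0):
    """Firing rate of each neuron: a Gaussian bump around its preferred value."""
    return [r_max * math.exp(-0.5 * ((stimulus - c) / width) ** 2) for c in centers]


def decode(responses, centers):
    """Center-of-mass estimate: preferred values weighted by firing rate."""
    total = sum(responses)
    if total == 0:
        raise ValueError("no activity to decode")
    return sum(r * c for r, c in zip(responses, centers)) / total


# Five neurons with broad, overlapping tuning curves (assumed layout).
centers = [-2.0, -1.0, 0.0, 1.0, 2.0]
rates = tuning_responses(0.0, centers)
estimate = decode(rates, centers)  # close to 0.0 by symmetry
```

In this picture, the shift from representing the current input to predicting it would show up as a decoded value that moves ahead of the true stimulus after training.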
The Spatial Resolution of Visual Attention
- Cognitive Psychology, 1997
Cited by 31 (7 self)
Two tasks were used to evaluate the grain of visual attention, the minimum spacing at which attention can select individual items. First, observers performed a tracking task at many viewing distances. Performance dropped to chance levels at small display sizes even though, in all conditions, observers could easily resolve the items and their motions. The limiting size for selection was roughly the same whether tracking one or three targets, suggesting that the resolution limit acts independently of the capacity limit of attention. Second, the closest spacing that still allowed individuation of single items in dense, static displays was examined. This critical spacing was about 50% coarser in the radial direction compared to the tangential direction, and was coarser in the upper as opposed to the lower visual field. The results suggest that no more than about 72 items can be arrayed in the central 30 degrees of the visual field while still allowing attentional access to each individuall...
A Model of Neuronal Responses in Visual Area MT
- 1997
Cited by 27 (5 self)
Electrophysiological studies indicate that neurons in the Middle Temporal (MT) area of the primate brain are selective for the velocity of visual stimuli. This paper describes a computational model of MT physiology, in which local image velocities are represented via the distribution of MT neuronal responses. The computation is performed in two stages, corresponding to neurons in cortical areas V1 and MT. Each stage computes a weighted linear sum of inputs, followed by rectification and divisive normalization. V1 receptive field weights are designed for orientation and direction selectivity. MT receptive field weights are designed for velocity (both speed and direction) selectivity. The paper includes computational simulations accounting for a wide range of physiological data, and describes experiments that could be used to further test and refine the model.
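The per-stage computation described here (weighted linear sum, rectification, divisive normalization) can be sketched as follows. The choice of half-squaring as the rectifier, the pooling of all units in the stage for normalization, and the constant `sigma` are assumptions made for illustration; the weights shown are arbitrary and are not the model's receptive fields.

```python
def stage(inputs, weights, sigma=1.0):
    """One model stage: linear sum, half-squaring, divisive normalization.

    `weights` holds one row of input weights per output unit. Rectification
    is half-squaring (assumed); normalization divides each unit by the
    summed rectified activity of all units plus sigma squared (assumed).
    """
    linear = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    rectified = [max(0.0, v) ** 2 for v in linear]
    pool = sum(rectified) + sigma ** 2
    return [r / pool for r in rectified]


# Arbitrary example: three units reading two inputs.
out = stage([1.0, 2.0], [[1, 0], [0, 1], [-1, 0]], sigma=1.0)
# linear sums are 1, 2, -1; rectified 1, 4, 0; pool 6

# The two stages compose directly (weights here are placeholders).
v1 = stage([1.0, 2.0], [[1, 0], [0, 1], [-1, 0]])
mt = stage(v1, [[1, 1, 0], [0, 1, 1]])
```

Because every output is divided by the pooled activity, scaling the input up makes the outputs saturate rather than grow without bound.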
Decoding Neuronal Firing And Modeling Neural Networks
- Quart. Rev. Biophys, 1994
Cited by 17 (3 self)
Introduction
Biological neural networks are large systems of complex elements interacting through a complex array of connections. Individual neurons express a large number of active conductances (Connors et al., 1982; Adams & Gavin, 1986; Llinás, 1988; McCormick, 1990; Hille, 1992) and exhibit a wide variety of dynamic behaviors on time scales ranging from milliseconds to many minutes (Llinás, 1988; Harris-Warrick & Marder, 1991; Churchland & Sejnowski, 1992; Turrigiano et al., 1994). Neurons in cortical circuits are typically coupled to thousands of other neurons (Stevens, 1989) and very little is known about the strengths of these synapses (although see Rosenmund et al., 1993; Hessler et al., 1993; Smetters & Nelson, 1993). The complex firing patterns of large neuronal populations are difficult to describe, let alone understand. There is little point in accurately modeling each membrane potential in a large neural

