Hierarchical Models of Object Recognition in Cortex, 1999
Cited by 344 (67 self)
The classical model of visual processing in cortex is a hierarchy of increasingly sophisticated representations, extending in a natural way the model of simple to complex cells of Hubel and Wiesel. Somewhat surprisingly, little quantitative modeling has been done in the last 15 years to explore the biological feasibility of this class of models to explain higher level visual processing, such as object recognition. We describe a new hierarchical model that accounts well for this complex visual task, is consistent with several recent physiological experiments in inferotemporal cortex and makes testable predictions. The model is based on a novel MAX-like operation on the inputs to certain cortical neurons which may have a general role in cortical function.
Learning Invariance From Transformation Sequences, 1991
Cited by 179 (2 self)
How can we consistently recognize objects when changes in the viewing angle, eye position, distance, size, orientation, relative position, or deformations of the object itself (e.g., of a newspaper or a gymnast) can change their retinal projections so significantly? The visual system must contain knowledge about such transformations in order to be able to generalize correctly. Part of this knowledge is probably determined genetically, but it is also likely that the visual system learns from its sensory experience, which contains plenty of examples of such transformations. Electrophysiological experiments suggest that the invariance properties of perception may be due to the receptive field characteristics of individual cells in the visual system. Complex cells in the primary visual cortex exhibit approximate invariance to position within a limited range (Hubel and Wiesel 1962), while cells in higher visual areas in the temporal cortex show more complex forms of invariance
Object recognition with features inspired by visual cortex - CVPR’05 -Volume, 2005
Cited by 133 (12 self)
We introduce a novel set of features for robust object recognition. Each element of this set is a complex feature obtained by combining position- and scale-tolerant edge-detectors over neighboring positions and multiple orientations. Our system’s architecture is motivated by a quantitative model of visual cortex. We show that our approach exhibits excellent recognition performance and outperforms several state-of-the-art systems on a variety of image datasets including many different object categories. We also demonstrate that our system is able to learn from very few examples. The performance of the approach constitutes a suggestive plausibility proof for a class of feedforward models of object recognition in cortex.
Robust object recognition with cortex-like mechanisms - IEEE Trans. Pattern Analysis and Machine Intelligence, 2007
Cited by 118 (20 self)
We introduce a new general framework for the recognition of complex visual scenes, which is motivated by biology: We describe a hierarchical system that closely follows the organization of visual cortex and builds an increasingly complex and invariant feature representation by alternating between a template matching and a maximum pooling operation. We demonstrate the strength of the approach on a range of recognition tasks: From invariant single object recognition in clutter to multiclass categorization problems and complex scene understanding tasks that rely on the recognition of both shape-based as well as texture-based objects. Given the biological constraints that the system had to satisfy, the approach performs surprisingly well: It has the capability of learning from only a few training examples and competes with state-of-the-art systems. We also discuss the existence of a universal, redundant dictionary of features that could handle the recognition of most object categories. In addition to its relevance for computer vision, the success of this approach suggests a plausibility proof for a class of feedforward models of object recognition in cortex.
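The alternation of template matching and maximum pooling described in this abstract can be illustrated with a toy one-dimensional sketch. This is not the authors' implementation: the Gaussian similarity, the non-overlapping pooling windows, and the names `template_match`, `max_pool`, and `s_c_stage` are all assumptions made for illustration.

```python
import math

def template_match(signal, template, sigma=1.0):
    """Template-matching stage: Gaussian similarity of each window to the template.
    (The Gaussian tuning function is an assumption of this sketch.)"""
    n = len(template)
    responses = []
    for i in range(len(signal) - n + 1):
        dist2 = sum((signal[i + k] - template[k]) ** 2 for k in range(n))
        responses.append(math.exp(-dist2 / (2 * sigma ** 2)))
    return responses

def max_pool(responses, size):
    """Pooling stage: keep the strongest response in each non-overlapping window."""
    return [max(responses[i:i + size]) for i in range(0, len(responses), size)]

def s_c_stage(signal, template, pool_size):
    """One template-matching stage followed by one max-pooling stage."""
    return max_pool(template_match(signal, template), pool_size)

pattern = [0.0, 1.0, 0.0]
a = [0, 1, 0, 0, 0, 0, 0, 0]  # pattern at position 0
b = [0, 0, 1, 0, 0, 0, 0, 0]  # same pattern shifted by one sample

# The raw matching responses differ, but the pooled responses coincide,
# because the shift stays inside one pooling window.
print(template_match(a, pattern) != template_match(b, pattern))
print(s_c_stage(a, pattern, 3) == s_c_stage(b, pattern, 3))
```

In this toy setting the pooled output is unchanged by a shift that stays within one pooling window, which is the kind of tolerance the alternation is meant to build up across stages.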
A biologically inspired system for action recognition - In ICCV, 2007
Cited by 71 (4 self)
We present a biologically-motivated system for the recognition of actions from video sequences. The approach builds on recent work on object recognition based on hierarchical feedforward architectures [25, 16, 20] and extends a neurobiological model of motion processing in the visual cortex [10]. The system consists of a hierarchy of spatio-temporal feature detectors of increasing complexity: an input sequence is first analyzed by an array of motion-direction sensitive units which, through a hierarchy of processing stages, lead to position-invariant spatio-temporal feature detectors. We experiment with different types of motion-direction sensitive units as well as different system architectures. As in [16], we find that sparse features in intermediate stages outperform dense ones and that using a simple feature selection approach leads to an efficient system that performs better with far fewer features. We test the approach on different publicly available action datasets, in all cases achieving the highest results reported to date.
Transfer of Coded Information from Sensory to Motor Networks, 1995
Cited by 57 (11 self)
During sensory-guided motor tasks, information must be transferred from arrays of neurons coding target location to motor networks that generate and control movement. We address two basic questions about this information transfer. First, what mechanisms assure that the different neural representations align properly so that activity in the sensory network representing target location evokes a motor response generating accurate movement toward the target? Coordinate transformations may be needed to put the sensory data into a form appropriate for use by the motor system. For example, in visually guided reaching the location of a target relative to the body is determined by a combination of the position of its image on the retina and the direction of gaze. What assures that the motor network responds to the appropriate combination of sensory inputs corresponding to target position in body- or arm-centered coordinates? To answer these questions, we model a sensory network coding target p...
Learning Optimized Features for Hierarchical Models of Invariant Object Recognition, 2002
Cited by 56 (28 self)
There is an ongoing debate over the capabilities of hierarchical neural feed-forward architectures for performing real-world invariant object recognition. Although a variety of hierarchical models exists, appropriate supervised and unsupervised learning methods are still an issue of intense research. We propose a feedforward model for recognition that shares components like weight-sharing, pooling stages, and competitive nonlinearities with earlier approaches, but focus on new methods for learning optimal feature-detecting cells in intermediate stages of the hierarchical network.
Learning and Problem Solving with Multilayer Connectionist Systems, 1986
Cited by 49 (1 self)
Learning and Problem Solving with Multilayer Connectionist Systems. September 1986. Charles William Anderson. B.S., University of Nebraska; M.S., University of Massachusetts; Ph.D., University of Massachusetts. Directed by: Professor Andrew G. Barto.
The difficulties of learning in multilayered networks of computational units has limited the use of connectionist systems in complex domains. This dissertation elucidates the issues of learning in a network's hidden units, and reviews methods for addressing these issues that have been developed through the years. Issues of learning in hidden units are shown to be analogous to learning issues for multilayer systems employing symbolic representations.
Hyperfeatures - multilevel local coding for visual recognition - In ECCV, 2006
Cited by 42 (1 self)
Histograms of local appearance descriptors are a popular representation for visual recognition. They are highly discriminant and have good resistance to local occlusions and to geometric and photometric variations, but they are not able to exploit spatial co-occurrence statistics at scales larger than their local input patches. We present a new multilevel visual representation, ‘hyperfeatures’, that is designed to remedy this. The starting point is the familiar notion that to detect object parts, in practice it often suffices to detect co-occurrences of more local object fragments – a process that can be formalized as comparison (e.g. vector quantization) of image patches against a codebook of known fragments, followed by local aggregation of the resulting codebook membership vectors to detect co-occurrences. This process converts local collections of image descriptor vectors into somewhat less local histogram vectors – higher-level but spatially coarser descriptors. We observe that as the output is again a local descriptor vector, the process can be iterated, and that doing so captures and codes ever larger assemblies of object parts and increasingly abstract or ‘semantic’ image properties. We formulate the hyperfeatures model and study its performance under several different image coding methods including clustering-based Vector Quantization, Gaussian Mixtures, and combinations of these with Latent Dirichlet Allocation. We find that the resulting high-level features provide improved performance in several object image and texture image classification tasks.
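The iteration described in this abstract (quantize descriptors against a codebook, aggregate the codes locally into histograms, then treat the histograms as new descriptors) can be sketched in a few lines. The sketch below is a simplified illustration, not the paper's method: it works on a one-dimensional sequence of descriptors rather than image positions, uses hard nearest-neighbor quantization only, and the codebooks, the window size, and the names `quantize` and `local_histograms` are hypothetical.

```python
def quantize(vec, codebook):
    """Index of the nearest codebook entry (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda j: sum((v - c) ** 2 for v, c in zip(vec, codebook[j])))

def local_histograms(descriptors, codebook, window):
    """Code each descriptor, then histogram the codes over sliding windows."""
    codes = [quantize(d, codebook) for d in descriptors]
    out = []
    for i in range(len(codes) - window + 1):
        hist = [0.0] * len(codebook)
        for c in codes[i:i + window]:
            hist[c] += 1.0 / window  # normalized codebook membership counts
        out.append(hist)
    return out

# Level 0: raw local descriptors (toy values) and a hand-picked codebook.
level0 = [[0.0, 0.1], [0.9, 1.0], [1.0, 0.9], [0.1, 0.0], [0.0, 0.0]]
codebook0 = [[0.0, 0.0], [1.0, 1.0]]

# Level 1: histograms over pairs of neighbors are again descriptor vectors.
level1 = local_histograms(level0, codebook0, window=2)

# Level 2: the same step applied to the level-1 output with its own codebook.
codebook1 = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
level2 = local_histograms(level1, codebook1, window=2)

print(level1)
print(level2)
```

Each level yields fewer, spatially coarser vectors that summarize a wider neighborhood of the input, which is what allows the step to be repeated.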
Are cortical models really bound by the “Binding Problem”? - Neuron, 1999
Cited by 41 (16 self)
Address correspondence to T.P. The usual description of visual processing in cortex is an extension of the simple to complex hierarchy postulated by Hubel and Wiesel — a feedforward sequence of more and more complex and invariant features. The capability of this class of models to perform higher level visual processing such as viewpoint-invariant object recognition in cluttered scenes has been questioned in recent years by several researchers, who in turn proposed an alternative class of models based on the synchronization of large assemblies of cells, within and across cortical areas. The main implicit argument for this novel and controversial view was the assumption that hierarchical models cannot deal with the computational requirements of high level vision and suffer from the so-called “binding problem”. We review the present situation and discuss theoretical and experimental evidence showing that the perceived weaknesses of hierarchical models are not true. In particular, we show that recognition of multiple objects in cluttered scenes, arguably among the most difficult tasks in vision, can be done in a hierarchical feedforward model.

