View-dependent object recognition by monkeys
- Current Biology, 1994
Cited by 75 (10 self)
How does the brain recognize three-dimensional objects? An initial step towards the understanding of the neural substrate of visual object recognition can be taken by studying first the nature of object representation, as manifested in behavioral studies with humans or non-human primates. One fundamental question is whether these representations are object or viewer centered. We trained monkeys to recognize computer-rendered objects presented from an arbitrarily chosen training view, and subsequently tested their ability to generalize recognition for views generated by mathematically rotating the objects around any arbitrary axis.
The Role of the Primary Visual Cortex in Higher Level Vision
- 1998
Cited by 67 (3 self)
In the classical feed-forward, modular view of visual processing, the primary visual cortex (area V1) is a module that serves to extract local features such as edges and bars. Representation and recognition of objects are thought to be functions of higher extrastriate cortical areas. This paper presents neurophysiological data that show the later part of V1 neurons' responses reflecting higher order perceptual computations related to Ullman's (Cognition 1984;18:97-159) visual routines and Marr's (Vision NJ: Freeman 1982) full primal sketch, 2½D sketch and 3D model. Based on theoretical reasoning and the experimental evidence, we propose a possible reinterpretation of the functional role of V1. In this framework, because of V1 neurons' precise encoding of orientation and spatial information, higher level perceptual computations and representations that involve high resolution details, fine geometry and spatial precision would necessarily involve V1 and be reflected in the later...
Shape contexts enable efficient retrieval of similar shapes
- In CVPR, 2001
Cited by 64 (12 self)
In this work we demonstrate that a recently introduced shape descriptor, the “shape context”, can be used to quickly prune a search for similar shapes. Our representation for a shape is a discrete set of n points sampled from its internal and external contours. For each of these points, the shape context is a histogram of the relative positions of the n - 1 remaining points. We present two methods for rapid shape retrieval: one that does comparisons based on a small number of shape contexts and another that uses vector quantization in the space of shape contexts. We verify the discriminative power of these methods with tests on the Columbia (COIL-100) 3D object database and the Snodgrass and Vanderwart line drawings. The shape context-based methods are shown to quickly produce an accurate shortlist of candidates suitable for a more exact matching engine in spite of pose variation and occlusion.
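The descriptor in this abstract, a histogram of where the remaining points lie relative to one reference point, can be illustrated with a minimal sketch. The log-polar binning, the bin counts (`n_r`, `n_theta`), and the normalisation by the farthest point are assumptions made for illustration; the abstract itself only states that the shape context is a histogram of relative positions.

```python
import math


def shape_context(points, index, n_r=5, n_theta=12):
    """Histogram of the positions of all other points relative to points[index].

    Assumed for illustration (not stated in the abstract): log-spaced radial
    bins, uniform angular bins, radii normalised by the farthest point.
    """
    px, py = points[index]
    offsets = [(x - px, y - py) for j, (x, y) in enumerate(points) if j != index]
    # Guard against an empty or fully coincident point set.
    r_max = max((math.hypot(dx, dy) for dx, dy in offsets), default=0.0) or 1.0

    hist = [[0] * n_theta for _ in range(n_r)]
    for dx, dy in offsets:
        d = math.hypot(dx, dy)
        # log2(1 + d / r_max) lies in [0, 1]; clamp the top edge into the last bin.
        r_bin = min(n_r - 1, int(n_r * math.log2(1.0 + d / r_max)))
        theta = math.atan2(dy, dx) % (2.0 * math.pi)
        t_bin = min(n_theta - 1, int(n_theta * theta / (2.0 * math.pi)))
        hist[r_bin][t_bin] += 1
    return hist


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -2.0)]
    for row in shape_context(pts, 0):
        print(row)
```

Each histogram counts exactly the n - 1 points other than the reference point, so two shapes sampled with the same n yield directly comparable descriptors.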
Image segmentation based on oscillatory correlation
- DeLiang Wang and David Terman, Neural Computation, 1997
Cited by 63 (18 self)
We study image segmentation on the basis of locally excitatory globally inhibitory oscillator networks (LEGION), whereby the phases of oscillators encode the binding of pixels. We introduce a potential for each oscillator so that only those oscillators with strong connections from their neighborhood can develop high potentials. Based on the concept of potential, a solution to remove noisy regions in an image is proposed for LEGION, so that it suppresses the oscillators corresponding to noisy regions, without affecting those corresponding to major regions. We show analytically that the resulting oscillator network separates an image into several major regions, plus a background consisting of all noisy regions, and illustrate network properties by computer simulation. The network exhibits a natural capacity in segmenting images. The oscillatory dynamics leads to a computer algorithm, which is applied successfully to segmenting real graylevel images. A number of issues regarding biological plausibility and perceptual organization are discussed. We argue that LEGION provides a novel and effective framework for image segmentation and figure-ground segregation.
SUSTAIN: A network model of category learning
- Psychological Review, 2004
Cited by 60 (10 self)
SUSTAIN (Supervised and Unsupervised STratified Adaptive Incremental Network) is a model of how humans learn categories from examples. SUSTAIN initially assumes a simple category structure. If simple solutions prove inadequate and SUSTAIN is confronted with a surprising event (e.g., it is told that a bat is a mammal instead of a bird), SUSTAIN recruits an additional cluster to represent the surprising event. Newly recruited clusters are available to explain future events and can themselves evolve into
Determining generative models of objects under varying illumination: Shape and albedo from multiple images using svd and integrability
- International Journal of Computer Vision, 1999
Cited by 56 (1 self)
We describe a method of learning generative models of objects from a set of images of the object under different, and unknown, illumination. Such a model allows us to approximate the objects’ appearance under a range of lighting conditions. This work is closely related to photometric stereo with unknown light sources and, in particular, to the use of Singular Value Decomposition (SVD) to estimate shape and albedo from multiple images up to a linear transformation [15]. Firstly we analyze and extend the SVD approach to this problem. We demonstrate that it applies to objects for which the dominant imaging effects are Lambertian reflectance with a distant light source and a background ambient term. To determine that this is a reasonable approximation we calculate the eigenvectors of the SVD on a set of real objects, under varying lighting conditions, and demonstrate that the first few eigenvectors account for most of the data in agreement with our predictions. We then analyze the linear ambiguities in the SVD approach and demonstrate that previous methods proposed to resolve them [15] are only valid under certain conditions. We discuss alternative possibilities and, in particular, demonstrate that knowledge of the object class is sufficient to resolve this problem. Secondly, we describe the use of surface consistency for putting constraints on the possible solutions. We prove that this constraint reduces the ambiguities to a subspace called the generalized bas relief ambiguity (GBR) which is inherent in the Lambertian reflectance function (and which can be shown to exist even if attached and cast shadows are present [3]). We demonstrate the use of surface consistency to solve for the shape and albedo up to a GBR and describe, and implement, a variety of additional assumptions to resolve the GBR. 
Thirdly, we demonstrate an iterative algorithm that can detect and remove some attached shadows from the objects thereby increasing the accuracy of the reconstructed shape and albedo.
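The linear ambiguity mentioned in this abstract can be seen in a small numerical sketch. It assumes a purely Lambertian model with no shadows and no ambient term, so the image matrix is a product of a pixels-by-3 matrix (albedo-scaled normals) and a 3-by-images matrix (light sources). The function name `factorize_rank3`, the synthetic data, and the even split of the singular values between the two factors are illustrative choices, not the authors' implementation.

```python
import numpy as np


def factorize_rank3(images):
    """Rank-3 SVD factorisation of a (pixels x images) intensity matrix.

    Returns B_hat (pixels x 3), S_hat (3 x images) and the singular values.
    The factors are only determined up to an invertible 3x3 matrix A,
    because (B_hat @ A) @ (inv(A) @ S_hat) reproduces the same images.
    """
    U, s, Vt = np.linalg.svd(images, full_matrices=False)
    root = np.sqrt(s[:3])
    B_hat = U[:, :3] * root           # scale the first three columns of U
    S_hat = root[:, None] * Vt[:3]    # scale the first three rows of Vt
    return B_hat, S_hat, s


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for albedo-scaled normals and light sources (assumed).
    B = rng.normal(size=(50, 3))
    S = rng.normal(size=(3, 8))
    I = B @ S  # shadow-free Lambertian images, one column per image

    B_hat, S_hat, s = factorize_rank3(I)
    print("singular values:", np.round(s, 6))
    print("max reconstruction error:", np.abs(B_hat @ S_hat - I).max())

    A = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 3.0]])
    same = np.allclose((B_hat @ A) @ (np.linalg.inv(A) @ S_hat), I)
    print("ambiguous factorisation reproduces the images:", same)
```

Under these assumptions only three singular values are significantly non-zero, and any invertible A gives an equally valid factorisation, which is why the abstract turns to additional constraints such as surface consistency.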
Extracting Buildings from Aerial Images using Hierarchical Aggregation in 2D and 3D
- 1998
Cited by 55 (4 self)
We propose a model-based approach to automated 3D extraction of buildings from aerial images. We focus on a reconstruction strategy that is not restricted to a small class of buildings. Therefore, we employ a generic modeling approach which relies on the well-defined combination of building part models. Building parts are classified by their roof type.
Invariant Object Recognition in the Visual System with Novel Views of 3D Objects
- 2002
Cited by 50 (11 self)
... In this article, we show how trace learning could solve the problem of in-depth rotation-invariant object recognition by developing representations of the transforms that features undergo when they are on the surfaces of 3D objects. Moreover, we show that having learned how features on 3D objects transform geometrically as the object is rotated in depth, the network can correctly recognize novel 3D variations within a generic view of an object composed of a new combination of previously learned features. These results are demonstrated in simulations of a hierarchical network model (VisNet) of the visual system that show that it can develop representations useful for the recognition of 3D objects by forming perspective-invariant representations to allow generalization within a generic view.
Shape classification using the inner-distance
- IEEE Trans. Pattern Anal. Mach. Intell., 2007
Cited by 50 (5 self)
Part structure and articulation are of fundamental importance in computer and human vision. We propose using the inner-distance to build shape descriptors that are robust to articulation and capture part structure. The inner-distance is defined as the length of the shortest path between landmark points within the shape silhouette. We show that it is articulation insensitive and more effective at capturing part structures than the Euclidean distance. This suggests that the inner-distance can be used as a replacement for the Euclidean distance to build more accurate descriptors for complex shapes, especially for those with articulated parts. In addition, texture information along the shortest path can be used to further improve shape classification. With this idea, we propose three approaches to using the inner-distance. The first method combines the inner-distance and multidimensional scaling (MDS) to build articulation invariant signatures for articulated shapes. The second method uses the inner-distance to build a new shape descriptor based on shape contexts. The third one extends the second one by considering the texture information along shortest paths. The proposed approaches have been tested on a variety of shape databases including an articulated shape dataset, MPEG7 CE-Shape-1, Kimia silhouettes, the ETH-80 data set, two leaf data sets, and a human motion silhouette dataset. In all the experiments, our methods demonstrate effective performance compared with other algorithms.
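The inner-distance defined in this abstract, the length of the shortest path between two points that stays inside the silhouette, can be approximated on a pixel grid. The sketch below is a coarse illustration under stated assumptions: the silhouette is a list of strings with `#` marking inside cells, moves are 4-connected with unit cost, and the function name `inner_distance` is hypothetical. It is not the computation used in the paper.

```python
from collections import deque


def inner_distance(mask, start, goal):
    """Grid approximation of the inner-distance between two silhouette cells.

    mask is a list of equal-length strings, '#' = inside the shape.
    Returns the number of 4-connected steps on the shortest inside path,
    or None if the goal cannot be reached without leaving the shape.
    """
    rows, cols = len(mask), len(mask[0])
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return dist[(r, c)]
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc] == "#"
            if inside and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return None


if __name__ == "__main__":
    # A U-shaped silhouette: the two tips are close in the plane
    # but far apart when the path must stay inside the shape.
    u_shape = [
        "#.#",
        "#.#",
        "###",
    ]
    print(inner_distance(u_shape, (0, 0), (0, 2)))
```

For the U shape the two tips are two cells apart in a straight line, yet the inside path must travel down one arm and up the other. This is the property that makes the measure insensitive to how the arms are bent.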
Visual interpretation of known objects in constrained scenes
- Phil. Trans. R. Soc. Lond. B, 1992
Cited by 47 (6 self)
Recent work on the visual interpretation of traffic scenes is described which relies heavily on a priori knowledge of the scene and position of the camera, and expectations about the shapes of vehicles and their likely movements in the scene. Knowledge is represented in the computer as explicit 3D geometrical models, dynamic filters, and descriptions of behaviour. Model-based vision, based on reasoning with analog models, avoids many of the classical problems in visual perception: recognition is robust against changes in the image of shape, size, colour and illumination. The 3D understanding of the scene which results also deals naturally with occlusion, and allows the behaviour of vehicles to be interpreted. The experiments with machine vision raise questions about the part played by perceptual context for object recognition in natural vision, and the neural mechanisms which might serve such a role.

