Results 11  20
of
330
ModelBased Object Pose in 25 Lines of Code
 International Journal of Computer Vision
, 1995
"... In this paper, we describe a method for finding the pose of an object from a single image. We assume that we can detect and match in the image four or more noncoplanar feature points of the object, and that we know their relative geometry on the object. The method combines two algorithms ..."
Abstract

Cited by 198 (4 self)
 Add to MetaCart
In this paper, we describe a method for finding the pose of an object from a single image. We assume that we can detect and match in the image four or more noncoplanar feature points of the object, and that we know their relative geometry on the object. The method combines two algorithms
A Theory of Networks for Approximation and Learning
 Laboratory, Massachusetts Institute of Technology
, 1989
"... Learning an inputoutput mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multidimensional function, that is solving the problem of hypersurface reconstruction. From this point of view, t ..."
Abstract

Cited by 194 (24 self)
 Add to MetaCart
Learning an inputoutput mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multidimensional function, that is solving the problem of hypersurface reconstruction. From this point of view, this form of learning is closely related to classical approximation techniques, such as generalized splines and regularization theory. This paper considers the problems of an exact representation and, in more detail, of the approximation of linear and nonlinear mappings in terms of simpler functions of fewer variables. Kolmogorov's theorem concerning the representation of functions of several variables in terms of functions of one variable turns out to be almost irrelevant in the context of networks for learning. Wedevelop a theoretical framework for approximation based on regularization techniques that leads to a class of threelayer networks that we call Generalized Radial Basis Functions (GRBF), since they are mathematically related to the wellknown Radial Basis Functions, mainly used for strict interpolation tasks. GRBF networks are not only equivalent to generalized splines, but are also closely related to pattern recognition methods suchasParzen windows and potential functions and to several neural network algorithms, suchas Kanerva's associative memory,backpropagation and Kohonen's topology preserving map. They also haveaninteresting interpretation in terms of prototypes that are synthesized and optimally combined during the learning stage. The paper introduces several extensions and applications of the technique and discusses intriguing analogies with neurobiological data.
Modal Matching for Correspondence and Recognition
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1995
"... Modal matching is a new method for establishing correspondences and computing canonical descriptions. The method is based on the idea of describing objects in terms of generalized symmetries, as defined by each object's eigenmodes. The resulting modal description is used for object recognition and c ..."
Abstract

Cited by 181 (6 self)
 Add to MetaCart
Modal matching is a new method for establishing correspondences and computing canonical descriptions. The method is based on the idea of describing objects in terms of generalized symmetries, as defined by each object's eigenmodes. The resulting modal description is used for object recognition and categorization, where shape similarities are expressed as the amounts of modal deformation energy needed to align the two objects. In general, modes provide a globaltolocal ordering of shape deformation and thus allow for selecting which types of deformations are used in object alignment and comparison. In contrast to previous techniques, which required correspondence to be computed with an initial or prototype shape, modal matching utilizes a new type of finite element formulation that allows for an object's eigenmodes to be computed directly from available image information. This improved formulation provides greater generality and accuracy, and is applicable to data of any dimensionality. Correspondence results with 2D contour and point feature data are shown, and recognition experiments with 2D images of hand tools and airplanes are described.
A probabilistic approach to object recognition using local photometry and global geometry
 European Conference on Computer Vision
, 1998
"... Abstract. Many object classes, including human faces, can be modeled as a set of characteristic parts arranged in a variable spatial con guration. We introduce a simpli ed model of a deformable object class and derive the optimal detector for this model. However, the optimal detector is not realizab ..."
Abstract

Cited by 148 (13 self)
 Add to MetaCart
Abstract. Many object classes, including human faces, can be modeled as a set of characteristic parts arranged in a variable spatial con guration. We introduce a simpli ed model of a deformable object class and derive the optimal detector for this model. However, the optimal detector is not realizable except under special circumstances (independent part positions). A cousin of the optimal detector is developed which uses \soft " part detectors with a probabilistic description of the spatial arrangement of the parts. Spatial arrangements are modeled probabilistically using shape statistics to achieve invariance to translation, rotation, and scaling. Improved recognition performance over methods based on \hard " part detectors is demonstrated for the problem of face detection in cluttered scenes. 1
Algebraic Functions For Recognition
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1994
"... In the general case, a trilinear relationship between three perspective views is shown to exist. The trilinearity result is shown to be of much practical use in visual recognition by alignment  yielding a direct reprojection method that cuts through the computations of camera transformation, sce ..."
Abstract

Cited by 147 (29 self)
 Add to MetaCart
In the general case, a trilinear relationship between three perspective views is shown to exist. The trilinearity result is shown to be of much practical use in visual recognition by alignment  yielding a direct reprojection method that cuts through the computations of camera transformation, scene structure and epipolar geometry. Moreover, the direct method is linear and sets a new lower theoretical bound on the minimal number of points that are required for a linear solution for the task of reprojection. The proof of the central result may be of further interest as it demonstrates certain regularities across homographies of the plane and introduces new view invariants. Experiments on simulated and real image data were conducted, including a comparative analysis with epipolar intersection and the linear combination methods, with results indicating a greater degree of robustness in practice and a higher level of performance in reprojection tasks. Keywords Visual Recognition, Al...
FORMS: A Flexible Object Recognition and Modeling System
 International Journal of Computer Vision
, 1995
"... We describe a flexible object recognition and modeling system (FORMS) which represents and recognizes animate objects from their silhouettes. This consists of a model for generating the shapes of animate objects which gives a formalism for solving the inverse problem of object recognition. We model ..."
Abstract

Cited by 145 (10 self)
 Add to MetaCart
We describe a flexible object recognition and modeling system (FORMS) which represents and recognizes animate objects from their silhouettes. This consists of a model for generating the shapes of animate objects which gives a formalism for solving the inverse problem of object recognition. We model all objects at three levels of complexity: (i) the primitives, (ii) the midgrained shapes, which are deformations of the primitives, and (iii) objects constructed by using a grammar to join midgrained shapes together. The deformations of the primitives can be characterized by principal component analysis or modal analysis. When doing recognition the representations of these objects are obtained in a bottomup manner from their silhouettes by a novel method for skeleton extraction and part segmentation based on deformable circles. These representations are then matched to a database of prototypical objects to obtain a set of candidate interpretations. These interpretations are verified in a...
Finding Naked People
, 1996
"... . This paper demonstrates a contentbased retrieval strategy that can tell whether there are naked people present in an image. No manual intervention is required. The approach combines color and texture properties to obtain an effective mask for skin regions. The skin mask is shown to be effective f ..."
Abstract

Cited by 144 (7 self)
 Add to MetaCart
. This paper demonstrates a contentbased retrieval strategy that can tell whether there are naked people present in an image. No manual intervention is required. The approach combines color and texture properties to obtain an effective mask for skin regions. The skin mask is shown to be effective for a wide range of shades and colors of skin. These skin regions are then fed to a specialized grouper, which attempts to group a human figure using geometric constraints on human structure. This approach introduces a new view of object recognition, where an object model is an organized collection of grouping hints obtained from a combination of constraints on geometric properties such as the structure of individual parts, and the relationships between parts, and constraints on color and texture. The system is demonstrated to have 60% precision and 52% recall on a test set of 138 uncontrolled images of naked people, mostly obtained from the internet, and 1401 assorted control images, drawn f...
An Eigenspace Update Algorithm for Image Analysis
, 1997
"... this paper However, the vision research community has largely overlooked makes the following contributions: parallel developments in signal processing and numerical linear algebra concerning efficient eigenspace updating algorithms. . We provide a comparison of some of the popular tech These new ..."
Abstract

Cited by 114 (3 self)
 Add to MetaCart
this paper However, the vision research community has largely overlooked makes the following contributions: parallel developments in signal processing and numerical linear algebra concerning efficient eigenspace updating algorithms. . We provide a comparison of some of the popular tech These new developments are significant for two reasons: Adopt niques existing in the vision literature for SVD/KLT com ing them will make some of the current vision algorithms more putations and point out the problems associated with robust and efficient. More important is the fact that incremental those techniques
On Photometric Issues in 3D Visual Recognition From A Single 2D Image
 International Journal of Computer Vision
, 1997
"... . We describe the problem of recognition under changing illumination conditions and changing viewing positions from a computational and human vision perspective. On the computational side we focus on the mathematical problems of creating an equivalence class for images of the same 3D object undergo ..."
Abstract

Cited by 108 (6 self)
 Add to MetaCart
. We describe the problem of recognition under changing illumination conditions and changing viewing positions from a computational and human vision perspective. On the computational side we focus on the mathematical problems of creating an equivalence class for images of the same 3D object undergoing certain groups of transformations  mostly those due to changing illumination, and briefly discuss those due to changing viewing positions. The computational treatment culminates in proposing a simple scheme for recognizing, via alignment, an image of a familiar object taken from a novel viewing position and a novel illumination condition. On the human vision aspect, the paper is motivated by empirical evidence inspired by Mooney images of faces that suggest a relatively high level of visual processing is involved in compensating for photometric sources of variability, and furthermore, that certain limitations on the admissible representations of image information may exist. The psycho...