Results 1–10 of 79
Probabilistic Methods for Finding People
 International Journal of Computer Vision, 2001
"... Finding people in pictures presents a particularly difficult object recognition problem. We show how to find people by finding candidate body segments, and then constructing assemblies of segments that are consistent with the constraints on the appearance of a person that result from kinematic prope ..."
Abstract

Cited by 104 (2 self)
 Add to MetaCart
Finding people in pictures presents a particularly difficult object recognition problem. We show how to find people by finding candidate body segments, and then constructing assemblies of segments that are consistent with the constraints on the appearance of a person that result from kinematic properties. Since a reasonable model of a person requires at least nine segments, it is not possible to inspect every group, due to the huge combinatorial complexity. We propose two
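To make the combinatorial argument concrete, here is a back-of-the-envelope count in Python. The candidate count of 50 is a hypothetical figure chosen for illustration, not a number from the paper: with N candidate segments and a nine-segment person model, the ordered nine-segment assemblies already number P(N, 9).

```python
from math import perm

# Hypothetical numbers for illustration only (not taken from the paper).
n_candidates = 50   # candidate body segments detected in an image
body_parts = 9      # segments in a minimal person model

# Ordered choices of 9 distinct segments, one per body part:
assemblies = perm(n_candidates, body_parts)
print(f"{assemblies:,} possible assemblies")  # ~9.1e14, far too many to inspect
```

Even at a modest 50 candidates, exhaustive inspection of every group is clearly infeasible, which is what motivates the pruning strategies the abstract alludes to.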
Calibration-free augmented reality
 IEEE Transactions on Visualization and Computer Graphics, 1998
"... Abstract—Camera calibration and the acquisition of Euclidean 3D measurements have so far been considered necessary requirements for overlaying threedimensional graphical objects with live video. In this article, we describe a new approach to videobased augmented reality that avoids both requirement ..."
Abstract

Cited by 84 (0 self)
 Add to MetaCart
Abstract—Camera calibration and the acquisition of Euclidean 3D measurements have so far been considered necessary requirements for overlaying three-dimensional graphical objects with live video. In this article, we describe a new approach to video-based augmented reality that avoids both requirements: It does not use any metric information about the calibration parameters of the camera or the 3D locations and dimensions of the environment’s objects. The only requirement is the ability to track across frames at least four fiducial points that are specified by the user during system initialization and whose world coordinates are unknown. Our approach is based on the following observation: Given a set of four or more non-coplanar 3D points, the projection of all points in the set can be computed as a linear combination of the projections of just four of the points. We exploit this observation by 1) tracking regions and color fiducial points at frame rate, and 2) representing virtual objects in a non-Euclidean, affine frame of reference that allows their projection to be computed as a linear combination of the projections of the fiducial points. Experimental results on two augmented reality systems, one monitor-based and one head-mounted, demonstrate that the approach is readily implementable, imposes minimal computational and hardware requirements, and generates real-time and accurate video overlays even when the camera parameters vary dynamically. Index Terms—Augmented reality, real-time computer vision, calibration, registration, affine representations, feature tracking, 3D interaction techniques.
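The observation above can be verified numerically. The sketch below, plain Python with a randomly generated and purely hypothetical 2x3 affine camera, expresses a point in the affine frame of four non-coplanar fiducials and checks that its projection equals the same combination of the fiducials' projections.

```python
import random

random.seed(0)

def rand3():
    return [random.gauss(0, 1) for _ in range(3)]

# Hypothetical affine camera: 2x3 linear part M plus a 2-vector translation t.
M = [rand3(), rand3()]
t = [random.gauss(0, 1), random.gauss(0, 1)]

def project(X):
    return [sum(M[r][k] * X[k] for k in range(3)) + t[r] for r in range(2)]

# Four non-coplanar 3D fiducial points define an affine frame.
p0, p1, p2, p3 = rand3(), rand3(), rand3(), rand3()

# Any other point, expressed in affine coordinates (a, b, c) of that frame...
a, b, c = 0.3, -1.2, 0.7
X = [p0[k] + a*(p1[k]-p0[k]) + b*(p2[k]-p0[k]) + c*(p3[k]-p0[k]) for k in range(3)]

# ...projects to the same combination of the fiducials' projections:
q0, q1, q2, q3 = map(project, (p0, p1, p2, p3))
predicted = [q0[r] + a*(q1[r]-q0[r]) + b*(q2[r]-q0[r]) + c*(q3[r]-q0[r]) for r in range(2)]
actual = project(X)
assert all(abs(actual[r] - predicted[r]) < 1e-9 for r in range(2))
```

This is exactly why no metric calibration is needed: the affine coordinates (a, b, c) can be recovered from the tracked fiducial projections alone.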
Parallel Algorithms for Hierarchical Clustering
 Parallel Computing, 1995
"... Hierarchical clustering is a common method used to determine clusters of similar data points in multidimensional spaces. O(n 2 ) algorithms are known for this problem [3, 4, 10, 18]. This paper reviews important results for sequential algorithms and describes previous work on parallel algorithms f ..."
Abstract

Cited by 80 (1 self)
 Add to MetaCart
Hierarchical clustering is a common method used to determine clusters of similar data points in multidimensional spaces. O(n^2) algorithms are known for this problem [3, 4, 10, 18]. This paper reviews important results for sequential algorithms and describes previous work on parallel algorithms for hierarchical clustering. Parallel algorithms to perform hierarchical clustering using several distance metrics are then described. Optimal PRAM algorithms using n/log n processors are given for the average link, complete link, centroid, median, and minimum variance metrics. Optimal butterfly and tree algorithms using n/log n processors are given for the centroid, median, and minimum variance metrics. Optimal asymptotic speedups are achieved for the best practical algorithm to perform clustering using the single link metric on an n/log n processor PRAM, butterfly, or tree. Keywords: hierarchical clustering, pattern analysis, parallel algorithm, butterfly network, PRAM algorithm.
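For contrast with the parallel algorithms the abstract describes, the naive sequential agglomerative baseline can be sketched in a few lines of Python. This is illustrative only (single-link metric, brute-force nearest-pair search) and is not one of the paper's algorithms.

```python
def single_link(points, k):
    """Merge the nearest pair of clusters until only k clusters remain."""
    dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q))
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Single-link distance: minimum pairwise point distance between clusters.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: min(dist(p, q) for p in clusters[ij[0]] for q in clusters[ij[1]]),
        )
        clusters[i].extend(clusters.pop(j))
    return clusters

pts = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(single_link(pts, 3))  # [[(0, 0), (0, 1)], [(5, 5), (5, 6)], [(10, 0)]]
```

Each merge step scans all cluster pairs, which is what makes the sequential problem expensive and the parallel nearest-pair search worthwhile.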
3D Object modeling and recognition using local affine-invariant image descriptors and multi-view spatial constraints
 International Journal of Computer Vision, 2006
"... Abstract. This article introduces a novel representation for threedimensional (3D) objects in terms of local affineinvariant descriptors of their images and the spatial relationships between the corresponding surface patches. Geometric constraints associated with different views of the same patche ..."
Abstract

Cited by 75 (11 self)
 Add to MetaCart
Abstract. This article introduces a novel representation for three-dimensional (3D) objects in terms of local affine-invariant descriptors of their images and the spatial relationships between the corresponding surface patches. Geometric constraints associated with different views of the same patches under affine projection are combined with a normalized representation of their appearance to guide matching and reconstruction, allowing the acquisition of true 3D affine and Euclidean models from multiple unregistered images, as well as their recognition in photographs taken from arbitrary viewpoints. The proposed approach does not require a separate segmentation stage, and it is applicable to highly cluttered scenes. Modeling and recognition results are presented.
Application of affine-invariant Fourier descriptors to recognition of 3D objects
 IEEE Transactions on Pattern Analysis and Machine Intelligence, 1990
"... AbstractIn this work, the method of Fourier descriptors has been extended to produce a set of normalized coefficients which are invariant under any affine transformation (translation, rotation, scaling, and shearing). The method is based on a parameterized boundary description which is transforme ..."
Abstract

Cited by 74 (2 self)
 Add to MetaCart
Abstract—In this work, the method of Fourier descriptors has been extended to produce a set of normalized coefficients which are invariant under any affine transformation (translation, rotation, scaling, and shearing). The method is based on a parameterized boundary description which is transformed to the Fourier domain and normalized there to eliminate dependencies on the affine transformation and on the starting point. Invariance to affine transforms allows considerable robustness when applied to images of objects which rotate in all three dimensions. This is demonstrated by processing silhouettes of aircraft as the aircraft maneuver in three-space. Index Terms—Affine transformation, features, Fourier descriptors, invariants, shape, 3D parameter estimation, 2D parameter determination.
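The normalization idea can be sketched for the simpler similarity group. The Python below is not the paper's method: it removes translation by dropping the DC coefficient, scale by dividing by |c_1|, and rotation and starting point by taking coefficient magnitudes; the paper's affine normalization additionally cancels shearing.

```python
import cmath

def fourier_descriptors(boundary, n_coeffs=6):
    """Similarity-invariant Fourier descriptors of a closed boundary.

    Simplified sketch: dropping c_0 removes translation, dividing by |c_1|
    removes scale, and taking magnitudes removes rotation and start point.
    """
    z = [complex(x, y) for x, y in boundary]
    N = len(z)
    coeff = lambda k: sum(z[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                          for n in range(N)) / N
    scale = abs(coeff(1))
    return [abs(coeff(k)) / scale for k in range(2, 2 + n_coeffs)]

# Check invariance on a square boundary under a similarity transform
# (rotate by 0.5 rad, scale by 2, translate by (3, 4)):
square = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]
moved = []
for x, y in square:
    w = 2.0 * cmath.exp(0.5j) * complex(x, y) + complex(3, 4)
    moved.append((w.real, w.imag))
assert all(abs(a - b) < 1e-9 for a, b in zip(fourier_descriptors(square),
                                             fourier_descriptors(moved)))
```

The magnitudes survive because rotation, scaling, and a start-point shift each multiply every coefficient c_k (k ≥ 1) by a fixed complex factor, while translation only changes c_0.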
A self-organizing multiple-view representation of 3D objects
 1991
"... We explore representation of 3D objects in which several distinct 2D views are stored for each object. We demonstrate the ability of a twolayer network of thresholded summation units to support such representations. Using unsupervised Hebbian relaxation, the network learned to recognize ten objects ..."
Abstract

Cited by 68 (16 self)
 Add to MetaCart
We explore representation of 3D objects in which several distinct 2D views are stored for each object. We demonstrate the ability of a two-layer network of thresholded summation units to support such representations. Using unsupervised Hebbian relaxation, the network learned to recognize ten objects from different viewpoints. The training process led to the emergence of compact representations of the specific input views. When tested on novel views of the same objects, the network exhibited a substantial generalization capability. In simulated psychophysical experiments, the network's behavior was qualitatively similar to that of human subjects.
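The two ingredients named in the abstract, thresholded summation units and a Hebbian weight update, can be sketched minimally. This is illustrative only; the paper's two-layer architecture and relaxation schedule are more elaborate.

```python
def unit_output(weights, x, threshold=0.5):
    """Thresholded summation unit: fire iff the weighted input sum exceeds threshold."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > threshold else 0

def hebbian_step(weights, x, y, lr=0.1):
    """Hebbian rule: strengthen a weight when its input and the unit's output co-fire."""
    return [w + lr * y * xi for w, xi in zip(weights, x)]

# Toy usage: repeated presentation of one 'view' strengthens the active weights,
# so the unit becomes increasingly selective for that view.
view = [1, 0, 1, 1]
w = [0.2, 0.2, 0.2, 0.2]
for _ in range(5):
    y = unit_output(w, view)
    w = hebbian_step(w, view, y)
```

After training, the weights for inputs active in the stored view grow while the rest stay put, a crude version of the view-specific representations the abstract describes.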
Determining The Gaze Of Faces In Images
 Image and Vision Computing, 1994
"... This paper describes a more exible visionbased approach, which can estimate the direction of gaze from a single, monocular view of a face. The technique makes minimal assumptions about the structure of the face, requires very few image measurements, and produces an accurate estimate of the facial o ..."
Abstract

Cited by 62 (0 self)
 Add to MetaCart
This paper describes a more flexible vision-based approach, which can estimate the direction of gaze from a single, monocular view of a face. The technique makes minimal assumptions about the structure of the face, requires very few image measurements, and produces an accurate estimate of the facial orientation, which is relatively insensitive to noise in the image and errors in the underlying assumptions. The computational requirements are insignificant, so with automatic tracking of a few facial features it is possible to produce gaze estimates at video rate. Keywords: gaze tracking, human-computer interaction, weak perspective, symmetry, real-time feature tracking. Humans have little difficulty sensing where another person is looking, often using this information to redeploy their own visual attention. Even pre-Renaissance artists were aware of this, using the gaze of characters in a painting to draw the viewer's eye to some significant part of the canvas. Yet this ability to determine a person's gaze, even from a single, monocular, uncalibrated view (as in paintings), is quite remarkable, especially considering the significant inter-subject variations in the facial features that provide the gaze cues. Current approaches to gaze tracking use active sensing to measure the orientation of the subject's eyes. The eye is illuminated with infrared light, and the gaze direction is inferred from the relative position of the bright pupil (the reflection off the retina) and the glint from the cornea [10]. The system's calibration is sensitive to movements of the subject's head, so the subject must either remain perfectly still, or wear cumbersome headgear to maintain a constant separation between the sensor and the eye. A passive, vision-based approach would ideally tolerate large head movements, and...
Psychophysical support for a 2D view interpolation theory of object recognition
"... Does the human brain represent objects for recognition by storing a series of twodimensional snapshots, or are the object models, in some sense, threedimensional analogs of the objects they represent? One way to address this question is to explore the ability of the human visual system to generaliz ..."
Abstract

Cited by 56 (24 self)
 Add to MetaCart
Does the human brain represent objects for recognition by storing a series of two-dimensional snapshots, or are the object models, in some sense, three-dimensional analogs of the objects they represent? One way to address this question is to explore the ability of the human visual system to generalize recognition from familiar to novel views of three-dimensional objects. Three recently proposed theories of object recognition, namely viewpoint normalization or alignment of 3D models [Ullman, S. (1989) Cognition, 32, 193–254], linear combination of 2D views [Ullman, S. & Basri, R. (1990)], and view approximation [Poggio, T. & Edelman, S. (1990) Nature, 343, 263–266], predict different patterns of generalization to novel views. We have exploited the conflicting predictions to test the three theories directly, in a psychophysical experiment involving computer-generated 3D objects. Our results suggest that the human visual system is better described as recognizing these objects by 2D view in...
Planar Object Recognition using Projective Shape Representation
 International Journal of Computer Vision, 1995
"... We describe a model based recognition system, called LEWIS, for the identification of planar objects based on a projectively invariant representation of shape. The advantages of this shape description include simple model acquisition (direct from images), no need for camera calibration or object pos ..."
Abstract

Cited by 51 (9 self)
 Add to MetaCart
We describe a model-based recognition system, called LEWIS, for the identification of planar objects based on a projectively invariant representation of shape. The advantages of this shape description include simple model acquisition (direct from images), no need for camera calibration or object pose computation, and the use of index functions. We describe the feature construction and recognition algorithms in detail and provide an analysis of the combinatorial advantages of using index functions. Index functions are used to select models from a model base and are constructed from projective invariants based on algebraic curves and a canonical projective coordinate frame. Examples are given of object recognition from images of real scenes, with extensive object libraries. Successful recognition is demonstrated despite partial occlusion by unmodelled objects, and realistic lighting conditions. In the context of this paper, recognition is defined as the prob...
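As a minimal illustration of the kind of projective invariant that can drive such index functions (LEWIS builds its invariants from algebraic curves and a canonical projective frame, which this sketch does not attempt), the cross-ratio of four collinear points is unchanged by any projective map of the line. The map's coefficients below are hypothetical.

```python
def cross_ratio(a, b, c, d):
    """Cross-ratio of four collinear points, given here by 1D coordinates."""
    return ((a - c) * (b - d)) / ((a - d) * (b - c))

def homography(x):
    """A hypothetical 1D projective map x -> (2x + 1) / (0.5x + 3)."""
    return (2.0 * x + 1.0) / (0.5 * x + 3.0)

pts = [0.0, 1.0, 2.0, 5.0]
before = cross_ratio(*pts)
after = cross_ratio(*(homography(x) for x in pts))
assert abs(before - after) < 1e-9   # the cross-ratio is preserved
```

Because such quantities do not change under viewpoint, they can be computed directly from an image and used to index into a model base without estimating pose or calibrating the camera, which is the role index functions play in the system described above.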