FORMS: A Flexible Object Recognition and Modeling System
International Journal of Computer Vision, 1995
Cited by 128 (9 self)
We describe a flexible object recognition and modeling system (FORMS) which represents and recognizes animate objects from their silhouettes. The system consists of a model for generating the shapes of animate objects, which gives a formalism for solving the inverse problem of object recognition. We model all objects at three levels of complexity: (i) the primitives, (ii) the mid-grained shapes, which are deformations of the primitives, and (iii) objects constructed by using a grammar to join mid-grained shapes together. The deformations of the primitives can be characterized by principal component analysis or modal analysis. During recognition, the representations of these objects are obtained in a bottom-up manner from their silhouettes by a novel method for skeleton extraction and part segmentation based on deformable circles. These representations are then matched to a database of prototypical objects to obtain a set of candidate interpretations. These interpretations are verified in a...
Fast and Globally Convergent Pose Estimation From Video Images
1998
Cited by 76 (3 self)
Determining the rigid transformation relating 2D images to known 3D geometry is a classical problem in photogrammetry and computer vision. Heretofore, the best methods for solving the problem have relied on iterative optimization methods which cannot be proven to converge and/or which do not effectively account for the orthonormal structure of rotation matrices. We show that the pose estimation problem can be formulated as that of minimizing an error metric based on collinearity in object (as opposed to image) space. Using object space collinearity error, we derive an iterative algorithm which directly computes orthogonal rotation matrices and which is globally convergent. Experimentally, we show that the method is computationally efficient, that it is no less accurate than the best currently employed optimization methods, and that it outperforms all tested methods in robustness to outliers. Chien-Ping Lu, Silicon Graphics Inc., cplu@engr.sgi.com; Greg Hager, Department of Computer...
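The object-space collinearity error mentioned in this abstract can be illustrated with a small sketch. The abstract does not give the formula, so the form used here is an assumption: a transformed model point q = R p + t is compared with its orthogonal projection onto the line of sight through the image point v, using the projector V = v v^T / (v^T v). The function names are hypothetical, and the sketch only evaluates the error for one point; it is not the iterative algorithm itself.

```python
def line_of_sight_projector(v):
    """Return V = v v^T / (v^T v), the projector onto the ray through v.

    v is an image point in normalized camera coordinates, e.g. (x, y, 1).
    """
    norm_sq = sum(c * c for c in v)
    return [[a * b / norm_sq for b in v] for a in v]


def mat_vec(m, x):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in m]


def object_space_error(rotation, translation, model_point, image_point):
    """Squared distance between q = R p + t and its projection V q.

    Assumed form of the object-space collinearity error for a single
    correspondence; it is zero when q lies on the line of sight.
    """
    q = [r + t for r, t in zip(mat_vec(rotation, model_point), translation)]
    projected = mat_vec(line_of_sight_projector(image_point), q)
    return sum((a - b) ** 2 for a, b in zip(q, projected))


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# The model point lands at (1, 2, 5); the image point (0.2, 0.4, 1) lies on
# the same ray, so the error is (numerically) zero.
on_ray = object_space_error(IDENTITY, [0.0, 0.0, 5.0], [1.0, 2.0, 0.0], [0.2, 0.4, 1.0])

# With the image point at the principal point (0, 0, 1), the error is the
# squared off-axis offset of (1, 2, 5).
off_ray = object_space_error(IDENTITY, [0.0, 0.0, 5.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0])
```

Measuring the residual in object space rather than in the image keeps the error quadratic in the translation for a fixed rotation, which is one plausible reason such a formulation is convenient for iteration.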
A Generic Grouping Algorithm and its Quantitative Analysis
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998
Cited by 51 (4 self)
This paper presents a generic method for perceptual grouping, and an analysis of its expected grouping quality. The grouping method is fairly general: it may be used for the grouping of various types of data features, and to incorporate different grouping cues, operating over feature sets of different sizes. The proposed method is divided into two parts: constructing a graph representation of the available perceptual grouping evidence, and then finding the "best" partition of the graph into groups. The first stage includes a cue enhancement procedure, which integrates the information available from multi-feature cues into very reliable bi-feature cues. Both stages are implemented using known statistical tools such as Wald's SPRT algorithm and the Maximum Likelihood criterion. The accompanying theoretical analysis of this grouping criterion quantifies intuitive expectations and predicts that the expected grouping quality increases with cue reliability. It also shows that investing more computational effort in the grouping algorithm leads to better grouping results. This analysis, which quantifies the grouping power of the Maximum Likelihood criterion, is independent of the grouping domain. To the best of our knowledge, such an analysis of a grouping process is given here for the first time. Three grouping algorithms, in three different domains, are synthesized as instances of the generic method. They demonstrate the applicability and generality of this grouping method. Keywords: Perceptual Grouping, Grouping Analysis, Graph Clustering, Maximum Likelihood, Wald's SPRT, Performance Prediction, Generic Grouping Algorithm.
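Wald's SPRT, named in this abstract as one of the statistical tools, can be sketched generically as follows. This is a textbook sequential probability ratio test on binary observations, not the paper's cue enhancement procedure. Treating each cue outcome as a Bernoulli sample, along with the function name and the parameters `p0`, `p1`, `alpha`, and `beta`, is an assumption made for illustration.

```python
import math


def sprt(observations, p0, p1, alpha=0.05, beta=0.05):
    """Generic Wald SPRT for Bernoulli observations.

    H0: success probability is p0; H1: success probability is p1.
    alpha and beta are the target false-positive and false-negative rates.
    Returns (decision, number_of_samples_used).
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this
    llr = 0.0                              # running log-likelihood ratio
    for n, x in enumerate(observations, 1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return ("H1", n)
        if llr <= lower:
            return ("H0", n)
    return ("undecided", len(observations))


# Hypothetical use: each 1 is a cue response suggesting that two features
# belong to the same group, each 0 a response suggesting that they do not.
decision, samples_used = sprt([1, 1, 1, 1], p0=0.2, p1=0.8)
```

The test stops as soon as the accumulated evidence crosses either threshold, so the number of samples examined depends on how decisive the observations are.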
Calibration Requirements and Procedures for a Monitor-Based Augmented Reality System
IEEE Transactions on Visualization and Computer Graphics, 1995
Cited by 40 (8 self)
Augmented reality entails the use of models and their associated renderings to supplement information in a real scene. In order for this information to be relevant or meaningful, the models must be positioned and displayed in such a way that they blend into the real world in terms of alignments, perspectives, illuminations, etc. For practical reasons the information necessary to obtain this realistic blending cannot be known a priori, and cannot be hard-wired into a system. Instead a number of calibration procedures are necessary so that the location and parameters of each of the system components are known. In this paper we identify the calibration steps necessary to build a computer model of the real world and then, using the monitor-based augmented reality system developed at ECRC (Grasp) as an example, we describe each of the calibration processes. These processes determine the internal parameters of our imaging devices (scan converter, frame grabber, and video camera), as we...
A Perceptual Grouping Hierarchy for Appearance-Based 3D Object Recognition
Computer Vision and Image Understanding, 1999
Cited by 40 (5 self)
In this report we consider the problem of 3D object recognition, and the role that perceptual grouping processes must play. In particular, we argue that a single level of perceptual grouping is inadequate, and that reliance on a single level of grouping is responsible for the specific weaknesses of several well-known recognition techniques. Instead, we argue that recognition must utilize a hierarchy of perceptual grouping processes, and describe an appearance-based system that uses four distinct levels of perceptual grouping, the upper two novel, to represent 3-D objects in a form that not only allows recognition, but reasoning about 3D manipulation of a sort that has been supported in the past only by 3D geometric models.
Three-dimensional shape knowledge for joint image segmentation and pose estimation
Pattern Recognition, volume 3663 of LNCS, 2005
Cited by 27 (22 self)
In this article we present the integration of 3-D shape knowledge into a variational model for level set based image segmentation and tracking. Given a 3-D surface model of an object that is visible in the image of one or multiple cameras calibrated to the same world coordinate system, the object contour extracted by the segmentation method is applied to estimate the 3-D pose parameters of the object. Vice versa, the surface model projected to the image plane helps in a top-down manner to improve the extraction of the contour. While common alternative segmentation approaches, which integrate 2-D shape knowledge, face the problem that an object can look very different from various viewpoints, a 3-D free form model ensures that for each view the model can fit the data in the image very well. Moreover, one additionally solves the higher level problem of determining the object pose in 3-D space. Due to the variational formulation, the approach clearly states all model assumptions in a single energy functional that is locally minimized by our method. Its performance is demonstrated by experiments with a monocular and a stereo camera system.
Computer Vision-Based Registration Techniques for Augmented Reality
Intelligent Robots and Computer Vision XV, 1996
Cited by 26 (3 self)
Augmented reality is a term used to describe systems in which computer-generated information is superimposed on top of the real world; for example, through the use of a see-through head-mounted display. A human user of such a system could still see and interact with the real world, but have valuable additional information, such as descriptions of important features or instructions for performing physical tasks, superimposed on the world. For example, the computer could identify objects and overlay them with graphic outlines, labels, and schematics. The graphics are registered to the real-world objects and appear to be painted onto those objects. Augmented reality systems can be used to make productivity aids for tasks such as inspection, manufacturing, and navigation. One of the most critical requirements for augmented reality is to recognize and locate real-world objects with respect to the person's head. Accurate registration is necessary in order to overlay graphics accurately on top of the real-world objects. At the Colorado School of Mines, we have developed a prototype augmented reality system that uses head-mounted cameras and computer vision techniques to accurately register the head to the scene. The current system locates and tracks a set of passive fiducial targets preplaced on the real-world objects. The system computes the pose of the objects and displays graphics overlays using a see-through head-mounted display. This paper describes the architecture of the system and outlines the computer vision techniques used. Keywords: augmented reality, registration, computer vision, pose estimation, fiducials, head-mounted displays
Learning object representations from lighting variations
In Object Representation in Computer Vision II, 1996
Cited by 24 (2 self)
Realistic representation of objects requires models which can synthesize the image of an object under all possible viewing conditions. We propose to learn these models from examples. Methods for learning surface geometry and albedo from one or more images under fixed pose and varying lighting conditions are described. Singular value decomposition (SVD) is used to determine shape, albedo, and lighting conditions up to an unknown 3×3 matrix, which is sufficient for recognition. The use of class-specific knowledge and the integrability constraint to determine this matrix is explored. We show that when the integrability constraint is applied to objects with varying albedo it leads to an ambiguity in depth estimation similar to the bas-relief ambiguity. The integrability constraint, however, is useful for resolving ambiguities which arise in current photometric theories. Object Recognition Workshop. ECCV. 1996.
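The "unknown 3×3 matrix" in this abstract can be made concrete with a short derivation. It is a sketch under assumptions the abstract does not state: a Lambertian surface without shadows, and the symbols $I$, $B$, $S$, and $A$ introduced here for illustration. Stacking $m$ images of $n$ pixels into a matrix $I$, with $B$ holding the albedo-scaled surface normals and $S$ the light source vectors:

```latex
\begin{align}
  I &= B\,S, \qquad I \in \mathbb{R}^{n \times m},\;
       B \in \mathbb{R}^{n \times 3},\; S \in \mathbb{R}^{3 \times m}, \\
  I &\approx U_3\,\Sigma_3\,V_3^{\top}
       \quad \text{(rank-3 truncated SVD)}, \\
  B &= U_3\,\Sigma_3^{1/2}\,A, \qquad
  S = A^{-1}\,\Sigma_3^{1/2}\,V_3^{\top},
       \quad A \in \mathbb{R}^{3 \times 3} \text{ invertible}.
\end{align}
```

Any invertible $A$ reproduces the same product $B\,S$, so the images alone determine shape, albedo, and lighting only up to this matrix. Additional constraints such as integrability are what narrow down the choice of $A$.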
Pose estimation in conformal geometric algebra. Part II: Real-time pose estimation using extended feature concepts
Journal of Mathematical Imaging and Vision, 2005
Cited by 19 (15 self)
2D-3D pose estimation means estimating the relative position and orientation of a 3D object with respect to a reference camera system. This work has its main focus on the theoretical foundations of the 2D-3D pose estimation problem: We discuss the involved mathematical spaces and their interaction within higher order entities. To cope with the pose problem (how to compare 2D projective image features with 3D Euclidean object features), the principle we propose is to reconstruct image features (e.g. points or lines) to entities one dimension higher
Percentile Blobs for Image Similarity
In Proceedings of the IEEE Workshop on Content-Based Access of Image and Video Libraries, 1998
Cited by 18 (5 self)
We present a new algorithm called PBSIM for computing image similarity, based upon a novel method of extracting bloblike features from images. In tests on a classification task using a data set of over 1000 images, PBSIM shows significantly higher accuracy than algorithms based upon color histograms, as well as previously reported results for another approach based upon bloblike features.
1 Introduction
As multimedia applications become more commonplace, the need increases for tools to manipulate large collections of visual data. One important tool that has already been the focus of significant research is an algorithmic process for determining the perceptual similarity of images. Such a tool can form the basis of many different image processing systems, including those for automatic classification of video or image data, retrieval of similar images from databases, and many other related and important tasks. For example, many image retrieval systems make use of a suite of similarity a...

