Results 1 - 10
of
1,378
CONDENSATION - conditional density propagation for visual tracking
- International Journal of Computer Vision
, 1998
"... The problem of tracking curves in dense visual clutter is challenging. Kalman filtering is inadequate because it is based on Gaussian densities which, being unimodal, cannot represent simultaneous alternative hypotheses. The Condensation algorithm uses "factored sampling", previously applied to the ..."
Abstract
-
Cited by 911 (12 self)
- Add to MetaCart
The problem of tracking curves in dense visual clutter is challenging. Kalman filtering is inadequate because it is based on Gaussian densities which, being unimodal, cannot represent simultaneous alternative hypotheses. The Condensation algorithm uses "factored sampling", previously applied to the interpretation of static images, in which the probability distribution of possible interpretations is represented by a randomly generated set. Condensation uses learned dynamical models, together with visual observations, to propagate the random set over time. The result is highly robust tracking of agile motion. Notwithstanding the use of stochastic methods, the algorithm runs in near real-time. Contents 1 Tracking curves in clutter 2 2 Discrete-time propagation of state density 3 3 Factored sampling 6 4 The Condensation algorithm 8 5 Stochastic dynamical models for curve motion 10 6 Observation model 13 7 Applying the Condensation algorithm to video-streams 17 8 Conclusions 26 A Non-line...
A Tutorial on Visual Servo Control
- IEEE Transactions on Robotics and Automation
, 1996
"... This paper provides a tutorial introduction to visual servo control of robotic manipulators. Since the topic spans many disciplines our goal is limited to providing a basic conceptual framework. We begin by reviewing the prerequisite topics from robotics and computer vision, including a brief review ..."
Abstract
-
Cited by 513 (17 self)
- Add to MetaCart
This paper provides a tutorial introduction to visual servo control of robotic manipulators. Since the topic spans many disciplines our goal is limited to providing a basic conceptual framework. We begin by reviewing the prerequisite topics from robotics and computer vision, including a brief review of coordinate transformations, velocity representation, and a description of the geometric aspects of the image formation process. We then present a taxonomy of visual servo control systems. The two major classes of systems, position-based and image-based systems, are then discussed. Since any visual servo system must be capable of tracking image features in a sequence of images, we include an overview of feature-based and correlation-based methods for tracking. We conclude the tutorial with a number of observations on the current directions of the research field of visual servo control. 1 Introduction Today there are over 800,000 robots in the world, mostly working in factory environment...
Contour Tracking By Stochastic Propagation of Conditional Density
, 1996
"... . In Proc. European Conf. Computer Vision, 1996, pp. 343--356, Cambridge, UK The problem of tracking curves in dense visual clutter is a challenging one. Trackers based on Kalman filters are of limited use; because they are based on Gaussian densities which are unimodal, they cannot represent s ..."
Abstract
-
Cited by 488 (23 self)
- Add to MetaCart
. In Proc. European Conf. Computer Vision, 1996, pp. 343--356, Cambridge, UK The problem of tracking curves in dense visual clutter is a challenging one. Trackers based on Kalman filters are of limited use; because they are based on Gaussian densities which are unimodal, they cannot represent simultaneous alternative hypotheses. Extensions to the Kalman filter to handle multiple data associations work satisfactorily in the simple case of point targets, but do not extend naturally to continuous curves. A new, stochastic algorithm is proposed here, the Condensation algorithm --- Conditional Density Propagation over time. It uses `factored sampling', a method previously applied to interpretation of static images, in which the distribution of possible interpretations is represented by a randomly generated set of representatives. The Condensation algorithm combines factored sampling with learned dynamical models to propagate an entire probability distribution for object pos...
Pictorial Structures for Object Recognition
- IJCV
, 2003
"... In this paper we present a statistical framework for modeling the appearance of objects. Our work is motivated by the pictorial structure models introduced by Fischler and Elschlager. The basic idea is to model an object by a collection of parts arranged in a deformable configuration. The appearance ..."
Abstract
-
Cited by 305 (13 self)
- Add to MetaCart
In this paper we present a statistical framework for modeling the appearance of objects. Our work is motivated by the pictorial structure models introduced by Fischler and Elschlager. The basic idea is to model an object by a collection of parts arranged in a deformable configuration. The appearance of each part is modeled separately, and the deformable configuration is represented by spring-like connections between pairs of parts. These models allow for qualitative descriptions of visual appearance, and are suitable for generic recognition problems. We use these models to address the problem of detecting an object in an image as well as the problem of learning an object model from training examples, and present efficient algorithms for both these problems. We demonstrate the techniques by learning models that represent faces and human bodies and using the resulting models to locate the corresponding objects in novel images.
Three-dimensional object recognition from single two-dimensional images
- Artificial Intelligence
, 1987
"... A computer vision system has been implemented that can recognize threedimensional objects from unknown viewpoints in single gray-scale images. Unlike most other approaches, the recognition is accomplished without any attempt to reconstruct depth information bottom-up from the visual input. Instead, ..."
Abstract
-
Cited by 303 (6 self)
- Add to MetaCart
A computer vision system has been implemented that can recognize threedimensional objects from unknown viewpoints in single gray-scale images. Unlike most other approaches, the recognition is accomplished without any attempt to reconstruct depth information bottom-up from the visual input. Instead, three other mechanisms are used that can bridge the gap between the two-dimensional image and knowledge of three-dimensional objects. First, a process of perceptual organization is used to form groupings and structures in the image that are likely to be invariant over a wide range of viewpoints. Second, a probabilistic ranking method is used to reduce the size of the search space during model based matching. Finally, a process of spatial correspondence brings the projections of three-dimensional models into direct correspondence with the image by solving for unknown viewpoint and model parameters. A high level of robustness in the presence of occlusion and missing data can be achieved through full application of a viewpoint consistency constraint. It is argued that similar mechanisms and constraints form the basis for recognition in human vision. This paper has been published in Artificial Intelligence, 31, 3 (March 1987), pp. 355–395. 1 1
Determining the Epipolar Geometry and its Uncertainty: A Review
- International Journal of Computer Vision
, 1998
"... Two images of a single scene/object are related by the epipolar geometry, which can be described by a 3×3 singular matrix called the essential matrix if images' internal parameters are known, or the fundamental matrix otherwise. It captures all geometric information contained in two images, an ..."
Abstract
-
Cited by 260 (7 self)
- Add to MetaCart
Two images of a single scene/object are related by the epipolar geometry, which can be described by a 3×3 singular matrix called the essential matrix if images' internal parameters are known, or the fundamental matrix otherwise. It captures all geometric information contained in two images, and its determination is very important in many applications such as scene modeling and vehicle navigation. This paper gives an introduction to the epipolar geometry, and provides a complete review of the current techniques for estimating the fundamental matrix and its uncertainty. A well-founded measure is proposed to compare these techniques. Projective reconstruction is also reviewed. The software which we have developed for this review is available on the Internet.
Fitting Parameterized Three-Dimensional Models to Images
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 1991
"... Model-based recognition and motion tracking depends upon the ability to solve for projection and model parameters that will best fit a 3-D model to matching 2-D image features. This paper extends current methods of parameter solving to handle objects with arbitrary curved surfaces and with any nu ..."
Abstract
-
Cited by 246 (7 self)
- Add to MetaCart
Model-based recognition and motion tracking depends upon the ability to solve for projection and model parameters that will best fit a 3-D model to matching 2-D image features. This paper extends current methods of parameter solving to handle objects with arbitrary curved surfaces and with any number of internal parameters representing articulations, variable dimensions, or surface deformations. Numerical
Photo Tourism: Exploring Photo Collections in 3D
- ACM TRANSACTIONS ON GRAPHICS
, 2006
"... We present a system for interactively browsing and exploring large unstructured collections of photographs of a scene using a novel 3D interface. Our system consists of an image-based modeling front end that automatically computes the viewpoint of each photograph as well as a sparse 3D model of the ..."
Abstract
-
Cited by 232 (20 self)
- Add to MetaCart
We present a system for interactively browsing and exploring large unstructured collections of photographs of a scene using a novel 3D interface. Our system consists of an image-based modeling front end that automatically computes the viewpoint of each photograph as well as a sparse 3D model of the scene and image to model correspondences. Our photo explorer uses image-based rendering techniques to smoothly transition between photographs, while also enabling full 3D navigation and exploration of the set of images and world geometry, along with auxiliary information such as overhead maps. Our system also makes it easy to construct photo tours of scenic or historic locations, and to annotate image details, which are automatically transferred to other relevant images. We demonstrate our system on several large personal photo collections as well as images gathered from Internet photo sharing sites.
Dynamic View Morphing
- In Proc. SIGGRAPH 96
, 1996
"... We present a novel technique for interpolating between two views of a dynamic scene. Our approach extends the concept of view morphing introduced in [SD96] and retains the relative advantages of that method. The interpolation will portray one possible physically-valid version of what transpired in t ..."
Abstract
-
Cited by 204 (20 self)
- Add to MetaCart
We present a novel technique for interpolating between two views of a dynamic scene. Our approach extends the concept of view morphing introduced in [SD96] and retains the relative advantages of that method. The interpolation will portray one possible physically-valid version of what transpired in the scene during the intervening time between views. The scene is assumed to consist of a small number of objects. Each object can undergo any motion during the time between views as long as its total movement is equivalent to a single, rigid translation. The dynamic view morphing technique can work with widely-spaced reference views, sparse point correspondences, and uncalibrated cameras. When the camera-to-camera transformation can be determined, the virtual objects can be portrayed moving along straight-line, constant-velocity trajectories. Methods are developed for determining the camera-to-camera transformation from information available in the reference views. It is shown that each mov...
An Efficient Solution to the Five-Point Relative Pose Problem
, 2004
"... An efficient algorithmic solution to the classical five-point relative pose problem is presented. The problem is to find the possible solutions for relative camera pose between two calibrated views given five corresponding points. The algorithm consists of computing the coefficients of a tenth degre ..."
Abstract
-
Cited by 204 (11 self)
- Add to MetaCart
An efficient algorithmic solution to the classical five-point relative pose problem is presented. The problem is to find the possible solutions for relative camera pose between two calibrated views given five corresponding points. The algorithm consists of computing the coefficients of a tenth degree polynomial in closed form and subsequently finding its roots. It is the first algorithm well suited for numerical implementation that also corresponds to the inherent complexity of the problem. We investigate the numerical precision of the algorithm. We also study its performance under noise in minimal as well as over-determined cases. The performance is compared to that of the well known 8 and 7-point methods and a 6-point scheme. The algorithm is used in a robust hypothesize-and-test framework to estimate structure and motion in real-time with low delay. The real-time system uses solely visual input and has been demonstrated at major conferences.

