Results 11 - 20
of
1,684
Advances in Computational Stereo
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2003
"... Extraction of three-dimensional structure of a scene from stereo images is a problem that has been studied by the computer vision community for decades. Early work focused on the fundamentals of image correspondence and stereo geometry. Stereo ..."
Abstract
-
Cited by 90 (2 self)
- Add to MetaCart
Extraction of three-dimensional structure of a scene from stereo images is a problem that has been studied by the computer vision community for decades. Early work focused on the fundamentals of image correspondence and stereo geometry. Stereo
Unified inverse depth parametrization for monocular slam
- In Proceedings of Robotics: Science and Systems
, 2006
"... Abstract—We present a new parametrization for point features within monocular simultaneous localization and mapping (SLAM) that permits efficient and accurate representation of uncertainty during undelayed initialization and beyond, all within the standard extended Kalman filter (EKF). The key conce ..."
Abstract
-
Cited by 77 (11 self)
- Add to MetaCart
Abstract—We present a new parametrization for point features within monocular simultaneous localization and mapping (SLAM) that permits efficient and accurate representation of uncertainty during undelayed initialization and beyond, all within the standard extended Kalman filter (EKF). The key concept is direct parametrization of the inverse depth of features relative to the camera locations from which they were first viewed, which produces measurement equations with a high degree of
Total recall: Automatic query expansion with a generative feature model for object retrieval
- In Proc. ICCV
, 2007
"... Given a query image of an object, our objective is to retrieve all instances of that object in a large (1M+) image database. We adopt the bag-of-visual-words architecture which has proven successful in achieving high precision at low recall. Unfortunately, feature detection and quantization are nois ..."
Abstract
-
Cited by 77 (10 self)
- Add to MetaCart
Given a query image of an object, our objective is to retrieve all instances of that object in a large (1M+) image database. We adopt the bag-of-visual-words architecture which has proven successful in achieving high precision at low recall. Unfortunately, feature detection and quantization are noisy processes and this can result in variation in the particular visual words that appear in different images of the same object, leading to missed results. In the text retrieval literature a standard method for improving performance is query expansion. A number of the highly ranked documents from the original query are reissued as a new query. In this way, additional relevant terms can be added to the query. This is a form of blind relevance feedback and it can fail if ‘outlier ’ (false positive) documents are included in the reissued query. In this paper we bring query expansion into the visual domain via two novel contributions. Firstly, strong spatial constraints between the query image and each result allow us to accurately verify each return, suppressing the false positives which typically ruin text-based query expansion. Secondly, the verified images can be used to learn a latent feature model to enable the controlled construction of expanded queries. We illustrate these ideas on the 5000 annotated image Oxford building database together with more than 1M Flickr images. We show that the precision is substantially boosted, achieving total recall in many cases. 1.
Image Change Detection Algorithms: A Systematic Survey
- IEEE Transactions on Image Processing
, 2005
"... Detecting regions of change in multiple images of the same scene taken at different times is of widespread interest due to a large number of applications in diverse disciplines, including remote sensing, surveillance, medical diagnosis and treatment, civil infrastructure, and underwater sensing. T ..."
Abstract
-
Cited by 64 (0 self)
- Add to MetaCart
Detecting regions of change in multiple images of the same scene taken at different times is of widespread interest due to a large number of applications in diverse disciplines, including remote sensing, surveillance, medical diagnosis and treatment, civil infrastructure, and underwater sensing. This paper presents a systematic survey of the common processing steps and core decision rules in modern change detection algorithms, including significance and hypothesis testing, predictive models, the shading model, and background modeling. We also discuss important preprocessing methods, approaches to enforcing the consistency of the change mask, and principles for evaluating and comparing the performance of change detection algorithms. It is hoped that our classification of algorithms into a relatively small number of categories will provide useful guidance to the algorithm designer.
A Framework for Robust Subspace Learning
- International Journal of Computer Vision
, 2003
"... Many computer vision, signal processing and statistical problems can be posed as problems of learning low dimensional linear or multi-linear models. These models have been widely used for the representation of shape, appearance, motion, etc, in computer vision applications. ..."
Abstract
-
Cited by 61 (5 self)
- Add to MetaCart
Many computer vision, signal processing and statistical problems can be posed as problems of learning low dimensional linear or multi-linear models. These models have been widely used for the representation of shape, appearance, motion, etc, in computer vision applications.
Simultaneous Linear Estimation of Multiple View Geometry and Lens Distortion
, 2001
"... A bugbear of uncalibrated stereo reconstruction is that cameras which deviate from the pinhole model have to be pre-calibrated in order to correct for nonlinear lens distortion. If they are not, and point correspondence is attempted using the uncorrected images, the matching constraints provided by ..."
Abstract
-
Cited by 60 (1 self)
- Add to MetaCart
A bugbear of uncalibrated stereo reconstruction is that cameras which deviate from the pinhole model have to be pre-calibrated in order to correct for nonlinear lens distortion. If they are not, and point correspondence is attempted using the uncorrected images, the matching constraints provided by the fundamental matrix must be set so loose that point matching is significantly hampered. This paper shows how linear estimation of the fundamental matrix from two-view point correspondences may be augmented to include one term of radial lens distortion. This is achieved by (1) changing from the standard radiallens model to another which (as we show) has equivalent power, but which takes a simpler form in homogeneous coordinates, and (2) expressing fundamental matrix estimation as a Quadratic Eigenvalue Problem (QEP), for which efficient algorithms are well known. I derive the new estimator, and compare its performance against bundle-adjusted calibration-grid data. The new estimator is fast enough to be included in a RANSAC-based matching loop, and we show cases of matching being rendered possible by its use. I show how the same lens can be calibrated in a natural scene where the lack of straight lines precludes most previous techniques. The modification when the multi-view relation is a planar homography or trifocal tensor is described. 1.
Video Compass
- In Proc. ECCV
, 2002
"... Abstract. In this paper we describe a flexible approach for determining the relative orientation of the camera with respect to the scene. The main premise of the approach is the fact that in man-made environments, the majority of lines is aligned with the principal orthogonal directions of the world ..."
Abstract
-
Cited by 60 (5 self)
- Add to MetaCart
Abstract. In this paper we describe a flexible approach for determining the relative orientation of the camera with respect to the scene. The main premise of the approach is the fact that in man-made environments, the majority of lines is aligned with the principal orthogonal directions of the world coordinate frame. We exploit this observation towards efficient detection and estimation of vanishing points, which provide strong constraints on camera parameters and relative orientation of the camera with respect to the scene. By combining efficient image processing techniques in the line detection and initialization stage we demonstrate that simultaneous grouping and estimation of vanishing directions can be achieved in the absence of internal parameters of the camera. Constraints between vanishing points are then used for partial calibration and relative rotation estimation. The algorithm has been tested in a variety of indoors and outdoors scenes and its efficiency and automation makes it amenable for implementation on robotic platforms. Key words: Vanishing point estimation, relative orientation, calibration using vanishing points, vision guided mobile and aerial robots. 1
3D Object modeling and recognition using local affine-invariant image descriptors and multi-view spatial constraints
- International Journal of Computer Vision
, 2006
"... Abstract. This article introduces a novel representation for three-dimensional (3D) objects in terms of local affine-invariant descriptors of their images and the spatial relationships between the corresponding surface patches. Geometric constraints associated with different views of the same patche ..."
Abstract
-
Cited by 58 (11 self)
- Add to MetaCart
Abstract. This article introduces a novel representation for three-dimensional (3D) objects in terms of local affine-invariant descriptors of their images and the spatial relationships between the corresponding surface patches. Geometric constraints associated with different views of the same patches under affine projection are combined with a normalized representation of their appearance to guide matching and reconstruction, allowing the acquisition of true 3D affine and Euclidean models from multiple unregistered images, as well as their recognition in photographs taken from arbitrary viewpoints. The proposed approach does not require a separate segmentation stage, and it is applicable to highly cluttered scenes. Modeling and recognition results are presented.
On affine invariant clustering and automatic cast listing in movies
- In Proc. ECCV
, 2002
"... Abstract We develop a distance metric for clustering and classification algorithms which is invariant to affine transformations and includes priors on the transformation parameters. Such clustering requirements are generic to a number of problems in computer vision. We extend existing techniques for ..."
Abstract
-
Cited by 57 (13 self)
- Add to MetaCart
Abstract We develop a distance metric for clustering and classification algorithms which is invariant to affine transformations and includes priors on the transformation parameters. Such clustering requirements are generic to a number of problems in computer vision. We extend existing techniques for affine-invariant clustering, and show that the new distance metric outperforms existing approximations to affine invariant distance computation, particularly under large transformations. In addition, we incorporate prior probabilities on the transformation parameters. This further regularizes the solution, mitigating a rare but serious tendency of the existing solutions to diverge. For the particular special case of corresponding point sets we demonstrate that the affine invariant measure we introduced may be obtained in closed form. As an application of these ideas we demonstrate that the faces of the principal cast of a feature film can be generated automatically using clustering with appropriate invariance. This is a very demanding test as it involves detecting and clustering over tens of thousands of images with the variances including changes in viewpoint, lighting, scale and expression. 1
3D Object Modeling and Recognition Using Affine-Invariant Patches and Multi-View Spatial Constraints
"... This paper presents a novel representation for three-dimensional objects in terms of affine-invariant image patches and their spatial relationships. Multi-view constraints associated with groups of patches are combined with a normalized representation of their appearance to guide matching and recons ..."
Abstract
-
Cited by 57 (8 self)
- Add to MetaCart
This paper presents a novel representation for three-dimensional objects in terms of affine-invariant image patches and their spatial relationships. Multi-view constraints associated with groups of patches are combined with a normalized representation of their appearance to guide matching and reconstruction, allowing the acquisition of true three-dimensional affine and Euclidean models from multiple images and their recognition in a single photograph taken from an arbitrary viewpoint. The proposed approach does not require a separate segmentation stage and is applicable to cluttered scenes. Preliminary modeling and recognition results are presented.

