From Few to Many: Illumination Cone Models for Face Recognition under Variable Lighting and Pose
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001
Cited by 438 (12 self)
We present a generative appearance-based method for recognizing human faces under variation in lighting and viewpoint. Our method exploits the fact that the set of images of an object in fixed pose, but under all possible illumination conditions, is a convex cone in the space of images. Using a small number of training images of each face taken with different lighting directions, the shape and albedo of the face can be reconstructed. In turn, this reconstruction serves as a generative model that can be used to render (or synthesize) images of the face under novel poses and illumination conditions. The pose space is then sampled, and for each pose the corresponding illumination cone is approximated by a low-dimensional linear subspace whose basis vectors are estimated using the generative model. Our recognition algorithm assigns to a test image the identity of the closest approximated illumination cone (based on Euclidean distance within the image space). We test our face recognition method on 4050 images from the Yale Face Database B; these images contain 405 viewing conditions (9 poses × 45 illumination conditions) for 10 individuals. The method performs almost without error, except on the most extreme lighting directions, and significantly outperforms popular recognition methods that do not use a generative model.
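The recognition step described in this abstract (assign the identity of the closest low-dimensional subspace, measured by Euclidean distance in image space) can be sketched as follows. This is only an illustration under simplifying assumptions: it keeps one hypothetical subspace per identity rather than one per sampled pose, uses plain orthogonal projection, and the names `subspace_distance`, `classify`, `person_a`, and `person_b` are invented here, not taken from the paper.

```python
import numpy as np

def subspace_distance(x, basis):
    """Euclidean distance from image vector x to the span of basis's columns."""
    # Orthonormalize the basis so the projection is a simple matrix product.
    q, _ = np.linalg.qr(basis)
    residual = x - q @ (q.T @ x)
    return float(np.linalg.norm(residual))

def classify(x, subspaces):
    """Return the label whose subspace lies closest to x."""
    return min(subspaces, key=lambda name: subspace_distance(x, subspaces[name]))

# Toy data (assumption): two identities in a 4-pixel image space,
# each represented by a 2-dimensional subspace.
subspaces = {
    "person_a": np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]),
    "person_b": np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]),
}
test_image = np.array([0.9, 0.2, 0.1, 0.0])
print(classify(test_image, subspaces))
```

In a setting with several sampled poses per person, the dictionary could hold one basis per (identity, pose) pair and the identity would be read off the winning key.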
What is the Set of Images of an Object Under All Possible Lighting Conditions?
IEEE CVPR, 1996
Cited by 325 (27 self)
The appearance of a particular object depends on both the viewpoint from which it is observed and the light sources by which it is illuminated. If the appearance of two objects is never identical for any pose or lighting conditions, then in theory the objects can always be distinguished or recognized. The question arises: What is the set of images of an object under all lighting conditions and pose? In this paper, we consider only the set of images of an object under variable illumination (including multiple, extended light sources and attached shadows). We prove that the set of n-pixel images of a convex object with a Lambertian reflectance function, illuminated by an arbitrary number of point light sources at infinity, forms a convex polyhedral cone in R^n and that the dimension of this illumination cone equals the number of distinct surface normals. Furthermore, we show that the cone for a particular object can be constructed from three properly chosen images. Finally, we prove that the set of n-pixel images of an object of any shape and with an arbitrary reflectance function, seen under all possible illumination conditions, still forms a convex cone in R^n. These results immediately suggest certain approaches to object recognition. Throughout this paper, we offer results demonstrating the empirical validity of the illumination cone representation.
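A small numerical check can make the cone property for the convex Lambertian case concrete. The sketch below assumes the usual image model for that case (each pixel is max(b·s, 0), where b is the albedo-scaled surface normal and s a distant point source, and images under several sources add), and uses random stand-in data; the matrix `b`, the sources, and the helper `render` are hypothetical names introduced for illustration.

```python
import numpy as np

def render(b, sources):
    """Image under several distant point sources: sum of max(B s, 0) per source."""
    return sum(np.maximum(b @ s, 0.0) for s in sources)

rng = np.random.default_rng(0)
b = rng.standard_normal((6, 3))      # assumption: 6 pixels, rows = albedo * normal
s1 = np.array([0.0, 0.0, 1.0])       # hypothetical light source directions
s2 = np.array([1.0, 0.0, 0.5])

x1 = render(b, [s1])
x2 = render(b, [s2])

# A nonnegative combination of two images is itself an image of the object,
# namely the one produced by the two sources with rescaled strengths.
combo = 2.0 * x1 + 0.5 * x2
same = render(b, [2.0 * s1, 0.5 * s2])
print(np.allclose(combo, same))
```

The check works because scaling a source by a positive factor scales max(B s, 0) by the same factor, so the image set is closed under nonnegative combinations.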
Automatic Camera Recovery for Closed or Open Image Sequences
In Proc. ECCV, 1998
Cited by 216 (17 self)
We describe progress in completely automatically recovering 3D scene structure together with 3D camera positions from a sequence of images acquired by an unknown camera undergoing unknown movement. The main departure from previous structure-from-motion strategies is that processing is not sequential. Instead, a hierarchical approach is employed, building from image triplets and associated trifocal tensors. This is advantageous both in obtaining correspondences and in optimally distributing error over the sequence. The major step forward is that closed sequences, that is, sequences where part of a scene is revisited at a later stage, can now be dealt with easily. Such sequences contain additional constraints, compared to open sequences, from which the reconstruction can now benefit. The computed cameras and structure are the backbone of a system to build texture-mapped graphical models directly from image sequences.
Illumination cones for recognition under variable lighting: Faces
In Proc. IEEE Conf. on Comp. Vision and ..., 1998
Cited by 97 (15 self)
Due to illumination variability, the same object can appear dramatically different even when viewed in fixed pose. To handle this variability, an object recognition system must employ a representation that is either invariant to, or models, this variability. This paper presents an appearance-based method for modeling the variability due to illumination in the images of objects. The method differs from past appearance-based methods, however, in that a small set of training images is used to generate a representation (the illumination cone) which models the complete set of images of an object with Lambertian reflectance under an arbitrary combination of point light sources at infinity. This method is both an implementation and extension (an extension in that it models cast shadows) of the illumination cone representation proposed in [3]. The method is tested on a database of 660 images of 10 faces, and the results exceed those of popular existing methods.
Classification with Nonmetric Distances: Image Retrieval and Class Representation
2000
Cited by 72 (0 self)
One of the key problems in appearance-based vision is understanding how to use a set of labeled images to classify new images. Classification systems that can model human performance, or that use robust image matching methods, often make use of similarity judgments that are nonmetric; but when the triangle inequality is not obeyed, most existing pattern recognition techniques are not applicable. We note that exemplar-based (or nearest-neighbor) methods can be applied naturally when using a wide class of nonmetric similarity functions. The key issue, however, is to find methods for choosing good representatives of a class that accurately characterize it. We show that existing condensing techniques for finding class representatives are ill-suited to deal with nonmetric dataspaces. We then focus on developing techniques for solving this problem, emphasizing two points: First, we show that the distance between two images is not a good measure of how well one image can represent ...
Robust Rotation and Translation Estimation in Multiview Reconstruction
Cited by 47 (4 self)
It is known that the problem of multiview reconstruction can be solved in two steps: first estimate camera rotations, and then use them to estimate translations. This paper presents new robust techniques for both of these steps. (i) Given pairwise relative rotations, global camera rotations are estimated linearly in the least-squares sense. (ii) Camera translations are estimated using a standard technique based on Second Order Cone Programming. Robustness is achieved by using only a subset of points according to a new criterion that diminishes the risk of choosing a mismatch. It is shown that only four points chosen in a special way are sufficient to represent a pairwise reconstruction almost as well as all points. This leads to a significant speedup. In image sets with repetitive or similar structures, nonexistent epipolar geometries may be found. Because of them, some rotations and consequently translations may be estimated incorrectly. It is shown that iterative removal of pairwise reconstructions with the largest residual, followed by re-registration, removes most nonexistent epipolar geometries. The performance of the proposed method is demonstrated on difficult wide baseline image sets.
Linear Fitting with Missing Data for Structure-from-Motion
Computer Vision and Image Understanding, 1997
Cited by 45 (6 self)
... this paper. This method is described in detail in [15]. We can briefly describe the method as formulating the least squares problem as a bilinear optimization, and then iteratively holding one set of variables constant while the others are optimized, so that each optimization is linear. We use their method in our experiments, because it has good convergence properties and is easy to implement. For the problem they consider, Shum et al. state that a random starting point is sufficient to produce a good final solution. However, their experiments on this point cannot be used to draw conclusions for the problem of determining 3D structure from a sequence of 2D images.
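The alternating scheme summarized in this excerpt (treat the fit as bilinear, hold one factor fixed, solve a linear least-squares problem for the other, then swap) can be sketched roughly as below. This is a generic illustration under stated assumptions, not the implementation from [15]: the function name `als_missing`, the random initialization, the fixed iteration count, and the toy rank-1 data are all choices made here.

```python
import numpy as np

def als_missing(m, mask, rank, iters=50, seed=0):
    """Fit m ~ a @ b on observed entries (mask == True) by alternating least squares."""
    rng = np.random.default_rng(seed)
    rows, cols = m.shape
    a = rng.standard_normal((rows, rank))   # random starting point (assumption)
    b = rng.standard_normal((rank, cols))
    for _ in range(iters):
        # Hold a fixed: each column of b solves a linear least-squares problem
        # that uses only the rows observed in that column.
        for j in range(cols):
            obs = mask[:, j]
            b[:, j] = np.linalg.lstsq(a[obs], m[obs, j], rcond=None)[0]
        # Hold b fixed: each row of a solves a linear least-squares problem
        # that uses only the columns observed in that row.
        for i in range(rows):
            obs = mask[i]
            a[i] = np.linalg.lstsq(b[:, obs].T, m[i, obs], rcond=None)[0]
    return a, b

# Toy data (assumption): an exact rank-1 matrix with three entries hidden.
u = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
v = np.array([1.0, 0.5, 2.0, 1.5, 3.0])
m = np.outer(u, v)
mask = np.ones(m.shape, dtype=bool)
mask[0, 1] = mask[2, 3] = mask[5, 0] = False

a, b = als_missing(m, mask, rank=1)
print(np.abs((a @ b - m)[mask]).max())   # largest residual on observed entries
```

Each inner solve is linear because one factor is frozen, which is what makes the scheme easy to implement; as the excerpt cautions, how well a random start works can depend on the problem.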
Structure from Many Perspective Images with Occlusions
2002
Cited by 28 (11 self)
This paper proposes a method for recovery of projective shape and motion from multiple images by factorization of a matrix containing the images of all scene points. Compared to previous methods, this method can handle perspective views and occlusions jointly. The projective depths of image points are estimated by the method of Sturm & Triggs [11] using epipolar geometry. Occlusions are handled by an extension of the method of Jacobs [8] for filling in missing data. This extension can exploit the geometry of the perspective camera so that both points with known and unknown projective depths are used. Many ways of combining the two methods exist; several of them have been examined, and the one with the best results is presented. The new method gives accurate results in practical situations, as demonstrated here with a series of experiments on laboratory and outdoor image sets. It becomes clear that the method is particularly suited for wide baseline multiple view stereo.
A Survey of Spatio-Temporal Grouping Techniques
2002
Cited by 26 (0 self)
Spatiotemporal segmentation of video sequences attempts to extract backgrounds and independent objects in the dynamic scenes captured in the sequences. It is an essential step of video analysis. It has important applications in video coding, video logging, indexing and retrieval, and more generally in scene interpretation and video understanding. We classify spatiotemporal grouping techniques into three categories: (1) segmentation with spatial priority, (2) segmentation by trajectory grouping, and (3) joint spatial and temporal segmentation. The first category is the broadest, as it inherits the legacy techniques of image segmentation and motion segmentation. The other two categories place a higher priority on the accumulation of evidence along the temporal dimension and are more recent developments made feasible by the increased availability of computing power. For each category we provide a taxonomy of the techniques used to produce meaningful pixel groupings.