On Spectral Clustering: Analysis and an algorithm
Advances in Neural Information Processing Systems, 2001
Cited by 756 (7 self)
Despite many empirical successes of spectral clustering methods -- algorithms that cluster points using eigenvectors of matrices derived from the distances between the points -- there are several unresolved issues. First, there is a wide variety of algorithms that use the eigenvectors in slightly different ways. Second, many of these algorithms have no proof that they will actually compute a reasonable clustering. In this paper, we present a simple spectral clustering algorithm that can be implemented using a few lines of Matlab. Using tools from matrix perturbation theory, we analyze the algorithm, and give conditions under which it can be expected to do well. We also show surprisingly good experimental results on a number of challenging clustering problems.
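The abstract above does not spell out the algorithm, so the following is only a minimal sketch of one common reading of it: build a Gaussian affinity matrix, normalize it by the degrees, take the top k eigenvectors, normalize the rows, and run k-means on those rows. The function name `spectral_clusters`, the parameter `sigma`, the use of orthogonal (subspace) iteration in place of a library eigensolver, and the farthest-point k-means initialization are all assumptions made for illustration. The sketch also assumes every point has at least one neighbor with non-negligible affinity, so no degree is zero.

```python
import math

def _sqdist(p, q):
    """Squared Euclidean distance between two coordinate sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def _orthonormalize(V):
    """Gram-Schmidt on a list of vectors (each a list of floats)."""
    for c in range(len(V)):
        for p in range(c):
            dot = sum(a * b for a, b in zip(V[c], V[p]))
            V[c] = [a - dot * b for a, b in zip(V[c], V[p])]
        norm = math.sqrt(sum(a * a for a in V[c]))
        V[c] = [a / norm for a in V[c]]
    return V

def spectral_clusters(points, k=2, sigma=1.0, iters=200):
    n = len(points)
    # Gaussian affinity with a zero diagonal.
    A = [[0.0 if i == j else
          math.exp(-_sqdist(points[i], points[j]) / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    d = [sum(row) for row in A]  # degrees (assumed nonzero)
    # M = (D^{-1/2} A D^{-1/2} + I) / 2: same eigenvectors as the normalized
    # affinity, but with eigenvalues shifted into [0, 1] so that subspace
    # iteration converges to the top-k eigenspace.
    M = [[0.5 * A[i][j] / math.sqrt(d[i] * d[j]) + (0.5 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # Orthogonal iteration from a fixed, generic starting basis.
    V = _orthonormalize([[math.cos((c + 1) * (i + 1)) for i in range(n)]
                         for c in range(k)])
    for _ in range(iters):
        V = _orthonormalize([[sum(M[i][j] * v[j] for j in range(n))
                              for i in range(n)] for v in V])
    # Embed point i as the i-th row of the eigenvector matrix, scaled to unit length.
    Y = []
    for i in range(n):
        row = [V[c][i] for c in range(k)]
        norm = math.sqrt(sum(x * x for x in row))
        Y.append([x / norm for x in row])
    # Plain k-means on the embedded rows, with farthest-point initialization.
    centers = [Y[0]]
    while len(centers) < k:
        centers.append(max(Y, key=lambda y: min(_sqdist(y, c) for c in centers)))
    labels = [0] * n
    for _ in range(20):
        labels = [min(range(k), key=lambda c: _sqdist(y, centers[c])) for y in Y]
        for c in range(k):
            members = [Y[i] for i in range(n) if labels[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

For two tight, well-separated groups of points, the embedded rows collapse onto two nearly orthogonal unit vectors, which is why a very simple k-means suffices in this sketch. In practice a library eigensolver would replace the hand-written iteration.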
Consistency of spectral clustering
2004
Cited by 170 (11 self)
Consistency is a key property of statistical algorithms when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spectral clustering algorithms, which cluster the data with the help of eigenvectors of graph Laplacian matrices. We show that one of the two major classes of spectral clustering (normalized clustering) converges under some very general conditions, while the other (unnormalized) is only consistent under strong additional assumptions, which, as we demonstrate, are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering in practical applications. We believe that methods used in our analysis will provide a basis for future exploration of Laplacian-based methods in a statistical setting.
Single View Metrology
1999
Cited by 120 (3 self)
We describe how 3D affine measurements may be computed from a single perspective view of a scene given only minimal geometric information determined from the image. This minimal information is typically the vanishing line of a reference plane, and a vanishing point for a direction not parallel to the plane. It is shown that affine scene structure may then be determined from the image, without knowledge of the camera's internal calibration (e.g. focal length), nor of the explicit relation between camera and world (pose). In particular, we show how to (i) compute the distance between planes parallel to the reference plane (up to a common scale factor); (ii) compute area and length ratios on any plane parallel to the reference plane; (iii) determine the camera's (viewer's) location. Simple geometric derivations are given for these results. We also develop an algebraic representation which unifies the three types of measurement and, amongst other advantages, permits a first order error pr...
Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix
SIAM Journal on Computing, 2004
Cited by 99 (17 self)
matrix A. It is often of interest to find a low-rank approximation to A, i.e., an approximation D to the matrix A of rank not greater than a specified rank k, where k is much smaller than m and n. Methods such as the Singular Value Decomposition (SVD) may be used to find an approximation to A which is the best in a well-defined sense. These methods require memory and time which are superlinear in m and n; for many applications in which the data sets are very large this is prohibitive. Two simple and intuitive algorithms are presented which, when given an m × n matrix A, compute a description of a low-rank approximation D* to A, and which are qualitatively faster than the SVD. Both algorithms have provable bounds for the error matrix A − D*. For any matrix X, let ‖X‖_F and ‖X‖_2 denote its Frobenius norm and its spectral norm, respectively. In the first algorithm, c = O(1) columns of A are randomly chosen. If the m × c matrix C consists of those c columns of A (after appropriate rescaling) then it is shown that from C^T C approximations to the top singular values and corresponding singular vectors may be computed. From the computed singular vectors a description D* of the matrix A may be computed such that rank(D*) ≤ k and such that ‖A − D*‖_ξ^2 ≤ min_{D: rank(D) ≤ k} ‖A − D‖_ξ^2 + poly(k, 1/c) ‖A‖_F^2 holds with high probability for both ξ = 2, F. This algorithm may be implemented without storing the matrix A in Random Access Memory (RAM), provided it can make two passes over the matrix stored in external memory and use O(m + n) additional RAM memory. The second algorithm is similar except that it further approximates the matrix C by randomly sampling r = O(1) rows of C to form an r × c matrix W. Thus, it has additional error, but it can be implemented in three passes over the matrix using only constant ...
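The column-sampling step described above can be sketched as follows. The abstract only says that columns are "randomly chosen" and "appropriately rescaled"; this sketch assumes sampling with probability proportional to the squared column norm and rescaling each chosen column by 1/sqrt(c * p_j), a standard choice that makes C C^T an unbiased estimate of A A^T. The names `sample_columns` and `gram` are hypothetical, and the sketch stops at forming C; it does not compute the singular vectors or the final approximation.

```python
import math
import random

def sample_columns(A, c, seed=0):
    """Return an m-by-c matrix C of rescaled columns sampled from A.

    Assumption: column j is drawn with probability p_j proportional to its
    squared Euclidean norm and rescaled by 1 / sqrt(c * p_j).
    """
    m, n = len(A), len(A[0])
    col_sq = [sum(A[i][j] ** 2 for i in range(m)) for j in range(n)]
    total = sum(col_sq)  # squared Frobenius norm of A
    probs = [s / total for s in col_sq]
    rng = random.Random(seed)
    picks = rng.choices(range(n), weights=probs, k=c)  # sampling with replacement
    return [[A[i][j] / math.sqrt(c * probs[j]) for j in picks] for i in range(m)]

def gram(M):
    """Return M M^T as a list of lists."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in M] for r1 in M]
```

One way to sanity-check the rescaling: for a rank-one matrix, every rescaled column is the same vector up to sign, so `gram(C)` reproduces `gram(A)` up to rounding regardless of which columns are drawn. For general matrices the match holds only in expectation and improves as c grows.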
Subspace Linear Discriminant Analysis for Face Recognition
1999
Cited by 75 (8 self)
In this paper we describe a holistic face recognition method based on subspace Linear Discriminant Analysis (LDA). The method consists of two steps: first we project the face image from the original vector space to a face subspace via Principal Component Analysis where the subspace dimension is carefully chosen, and then use LDA to obtain a linear classifier in the subspace. The criterion we use to choose the subspace dimension enables us to generate class-separable features via LDA. In addition, we employ a weighted distance metric guided by the LDA eigenvalues to improve the performance of the subspace LDA method. Finally, the improved performance of the subspace LDA approach is demonstrated through experiments using the FERET dataset for face recognition/verification, a large mugshot dataset for person verification, and the MPEG-7 dataset. (Footnote 1: Partially supported by the Office of Naval Research under Grant N00014-95-1-0521.) I. Introduction. The problem of automatic face recognition...
A Chernoff Bound For Random Walks On Expander Graphs
SIAM J. Comput., 1998
Cited by 66 (0 self)
We consider a finite random walk on a weighted graph G; we show that the fraction of time spent in a set of vertices A converges to the stationary probability π(A) with error probability exponentially small in the length of the random walk and the square of the size of the deviation from π(A). The exponential bound is in terms of the expansion of G and improves previous results of [D. Aldous, Probab. Engrg. Inform. Sci., 1 (1987), pp. 33--46], [L. Lovasz and M. Simonovits, Random Structures Algorithms, 4 (1993), pp. 359--412], [M. Ajtai, J. Komlos, and E. Szemeredi, Deterministic simulation of logspace, in Proc. 19th ACM Symp. on Theory of Computing, 1987]. We show that taking the sample average from one trajectory gives a more efficien...
Three-Dimensional Face Recognition
2005
Cited by 64 (22 self)
An expression-invariant 3D face recognition approach is presented. Our basic assumption is that facial expressions can be modelled as isometries of the facial surface. This allows us to construct expression-invariant representations of faces using the bending-invariant canonical forms approach. The result is an efficient and accurate face recognition algorithm, robust to facial expressions, that can distinguish between identical twins (the first two authors). We demonstrate a prototype system based on the proposed algorithm and compare its performance to classical face recognition methods. The numerical methods employed by our approach do not require the facial surface explicitly. The surface gradient field, or the surface metric, is sufficient for constructing the expression-invariant representation of any given face. This allows us to perform the 3D face recognition task while avoiding the surface reconstruction stage.
On the Early History of the Singular Value Decomposition
1992
Cited by 63 (1 self)
This paper surveys the contributions of five mathematicians --- Eugenio Beltrami (1835--1899), Camille Jordan (1838--1921), James Joseph Sylvester (1814--1897), Erhard Schmidt (1876--1959), and Hermann Weyl (1885--1955) --- who were responsible for establishing the existence of the singular value decomposition and developing its theory.
Spectral Analysis of Internet Topologies
2003
Cited by 63 (7 self)
We perform spectral analysis of the Internet topology at the AS level, by adapting the standard spectral filtering method of examining the eigenvectors corresponding to the largest eigenvalues of matrices related to the adjacency matrix of the topology. We observe that the method suggests clusters of ASes with natural semantic proximity, such as geography or business interests. We examine how these clustering properties vary in the core and in the edge of the network, as well as across geographic areas, over time, and between real and synthetic data. We observe that these clustering properties may be suggestive of traffic patterns and thus have direct impact on the link stress of the network. Finally, we use the weights of the eigenvector corresponding to the first eigenvalue to obtain an alternative hierarchical ranking of the ASes.
Some Perturbation Theory for Linear Programming
Mathematical Programming, 1992
Cited by 59 (2 self)
This paper examines a few relations between solution characteristics of an LP and the amount by which the LP must be perturbed to obtain either a primal infeasible LP or a dual infeasible LP. We consider such solution characteristics as the size of the optimal solution and the sensitivity of the optimal value to data perturbations. We show, for example, that an LP has a large optimal solution, or has a sensitive optimal value, only if the instance is nearly primal infeasible or dual infeasible. The results are not particularly surprising but they do formalize an interesting viewpoint which apparently has not been made explicit in the linear programming literature. The results are rather general. Several of the results are valid for linear programs defined in arbitrary real normed spaces. A Hahn-Banach Theorem is the main tool employed in the analysis; given a closed convex set in a normed vector space and a point in the space but not in the set, there exists a continuous linear functional strictly separating the set from the point. We introduce notation, then the results. Let X, Y denote real vector spaces, each with a norm. We use the same notation (i.e. ‖ · ‖) for all norms, it being clear from context which norm is referred to. Let X

