Results 1–10 of 396
Consistency of spectral clustering
, 2004
Cited by 286 (15 self)
Consistency is a key property of statistical algorithms when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spectral clustering algorithms, which cluster the data with the help of eigenvectors of graph Laplacian matrices. We show that one of the two major classes of spectral clustering (normalized clustering) converges under some very general conditions, while the other (unnormalized) is only consistent under strong additional assumptions, which, as we demonstrate, are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering in practical applications. We believe that methods used in our analysis will provide a basis for future exploration of Laplacian-based methods in a statistical setting.
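The two Laplacian variants compared in this abstract can be illustrated with a minimal numerical sketch. This is not the paper's analysis, only a toy bipartition using the second eigenvector of either the unnormalized Laplacian L = D − W or the normalized symmetric Laplacian I − D^{-1/2} W D^{-1/2}; the function name and data are illustrative.

```python
import numpy as np

def spectral_bipartition(W, normalized=True):
    """Split a graph with similarity matrix W into two clusters using the
    eigenvector of the second-smallest eigenvalue of a graph Laplacian."""
    d = W.sum(axis=1)
    if normalized:
        # Normalized (symmetric) Laplacian: I - D^{-1/2} W D^{-1/2}
        d_inv_sqrt = 1.0 / np.sqrt(d)
        L = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    else:
        # Unnormalized Laplacian: L = D - W
        L = np.diag(d) - W
    vals, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    fiedler = vecs[:, 1]             # second-smallest eigenvector
    return fiedler >= 0              # boolean cluster labels

# Two well-separated 1-D blobs; similarity given by a Gaussian kernel
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 0.1, 5), rng.normal(5, 0.1, 5)])
W = np.exp(-(x[:, None] - x[None, :]) ** 2)
labels = spectral_bipartition(W, normalized=True)
```

On such well-separated data both variants recover the two blobs; the paper's point is that only the normalized variant is consistent as the sample size grows, under general conditions.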
Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix
 SIAM Journal on Computing
, 2004
Cited by 142 (17 self)
matrix A. It is often of interest to find a low-rank approximation to A, i.e., an approximation D to the matrix A of rank not greater than a specified rank k, where k is much smaller than m and n. Methods such as the Singular Value Decomposition (SVD) may be used to find an approximation to A which is the best in a well-defined sense. These methods require memory and time which are superlinear in m and n; for many applications in which the data sets are very large this is prohibitive. Two simple and intuitive algorithms are presented which, when given an m × n matrix A, compute a description of a low-rank approximation D* to A, and which are qualitatively faster than the SVD. Both algorithms have provable bounds for the error matrix A − D*. For any matrix X, let ||X||_F and ||X||_2 denote its Frobenius norm and its spectral norm, respectively. In the first algorithm, c = O(1) columns of A are randomly chosen. If the m × c matrix C consists of those c columns of A (after appropriate rescaling), then it is shown that from C^T C approximations to the top singular values and corresponding singular vectors may be computed. From the computed singular vectors a description D* of the matrix A may be computed such that rank(D*) ≤ k and such that ||A − D*||_ξ^2 ≤ min_{D: rank(D) ≤ k} ||A − D||_ξ^2 + poly(k, 1/c) ||A||_F^2 holds with high probability for both ξ = 2, F. This algorithm may be implemented without storing the matrix A in Random Access Memory (RAM), provided it can make two passes over the matrix stored in external memory and use O(m + n) additional RAM. The second algorithm is similar, except that it further approximates the matrix C by randomly sampling r = O(1) rows of C to form an r × c matrix W. Thus it has additional error, but it can be implemented in three passes over the matrix using only constant ...
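The column-sampling idea described above can be sketched numerically: sample c columns with probability proportional to their squared norms, rescale them, and use the top left singular vectors of the sampled block to build the rank-k description. This is an illustrative sketch of the idea, not the paper's exact two-pass procedure; the function name and test data are assumptions.

```python
import numpy as np

def randomized_low_rank(A, k, c, rng):
    """Rank-k approximation of A built from c sampled, rescaled columns
    (a sketch of the norm-squared column-sampling idea)."""
    m, n = A.shape
    # Sample columns with probability proportional to their squared norms
    col_norms = (A ** 2).sum(axis=0)
    p = col_norms / col_norms.sum()
    idx = rng.choice(n, size=c, replace=True, p=p)
    # Rescale so that C C^T is an unbiased estimator of A A^T
    C = A[:, idx] / np.sqrt(c * p[idx])
    # Top-k left singular vectors of C approximate those of A
    U, _, _ = np.linalg.svd(C, full_matrices=False)
    Uk = U[:, :k]
    return Uk @ (Uk.T @ A)  # project A onto the approximate top-k subspace

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 40))  # exactly rank 3
D = randomized_low_rank(A, k=3, c=20, rng=rng)
err = np.linalg.norm(A - D) / np.linalg.norm(A)
```

On an exactly rank-3 matrix the sampled columns almost surely span the column space, so the projection recovers A to machine precision; for general matrices the abstract's additive error bound applies instead.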
On the Nyström Method for Approximating a Gram Matrix for Improved KernelBased Learning
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2005
Cited by 108 (7 self)
A problem for many kernel-based methods is that the amount of computation required to find the solution scales as O(n³), where n is the number of training examples. We develop and analyze an algorithm to compute an easily interpretable low-rank approximation to an n × n Gram matrix G such that computations of interest may be performed more rapidly. The approximation is of the form G̃_k = C W_k^+ C^T, where C is a matrix consisting of a small number c of columns of G and W_k is the best rank-k approximation to W, the matrix formed by the intersection between those c columns of G and the corresponding c rows of G. An important aspect of the algorithm is the probability distribution used to randomly sample the columns; we will use a judiciously chosen and data-dependent nonuniform probability distribution. Let ||·||_2 and ||·||_F denote the spectral norm and the Frobenius norm, respectively, of a matrix, and let G_k be the best rank-k approximation to G. We prove that by choosing O(k/ε⁴) columns, ||G − G̃_k||_ξ ≤ ||G − G_k||_ξ + ε Σ_i G_ii² holds both in expectation and with high probability, for both ξ = 2, F, and for all k: 0 ≤ k ≤ rank(W). This approximation can be computed using O(n) additional space and time, after making two passes over the data from external storage. The relationships between this algorithm, other related matrix decompositions, and the Nyström method from integral equation theory are discussed.
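The Nyström-style approximation C W_k^+ C^T described above is simple to sketch. The following toy version uses the first c columns rather than the paper's data-dependent sampling distribution, and the kernel and sizes are illustrative assumptions.

```python
import numpy as np

def nystrom_approx(G, cols, k):
    """Rank-k Nystrom-style approximation of a Gram matrix G from a
    subset of its columns: G ~= C W_k^+ C^T."""
    C = G[:, cols]                # n x c block of sampled columns
    W = G[np.ix_(cols, cols)]     # c x c intersection block
    # Best rank-k approximation to W via its eigendecomposition,
    # then pseudo-invert the rank-k part
    vals, vecs = np.linalg.eigh(W)
    order = np.argsort(vals)[::-1][:k]       # top-k eigenvalues
    vals_k, vecs_k = vals[order], vecs[:, order]
    W_k_pinv = vecs_k @ np.diag(1.0 / vals_k) @ vecs_k.T
    return C @ W_k_pinv @ C.T

# Gram matrix of random 2-D points under an RBF kernel (illustrative)
rng = np.random.default_rng(2)
X = rng.normal(size=(30, 2))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
G = np.exp(-sq)
G_tilde = nystrom_approx(G, cols=list(range(10)), k=5)
```

Only the c sampled columns of G need to be formed, which is the source of the O(n) space and time claim in the abstract.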
The Mathematics Of Eigenvalue Optimization
, 2003
Cited by 92 (13 self)
Optimization problems involving the eigenvalues of symmetric and nonsymmetric matrices present a fascinating mathematical challenge. Such problems arise often in theory and practice, particularly in engineering design, and are amenable to a rich blend of classical mathematical techniques and contemporary optimization theory. This essay presents a personal choice of some central mathematical ideas, outlined for the broad optimization community. I discuss the convex analysis of spectral functions and invariant matrix norms, touching briefly on semidefinite representability, and then outlining two broader algebraic viewpoints based on hyperbolic polynomials and Lie algebra. Analogous nonconvex notions lead into eigenvalue perturbation theory. The last third of the article concerns stability, for polynomials, matrices, and associated dynamical systems, ending with a section on robustness. The powerful and elegant language of nonsmooth analysis appears throughout, as a unifying narrative thread.
Fast Monte Carlo algorithms for matrices I: Approximating matrix multiplication
 SIAM Journal on Computing
, 2004
Quantum mechanics as quantum information (and only a little more), Quantum Theory: Reconsideration of Foundations
, 2002
Cited by 61 (6 self)
In this paper, I try once again to cause some good-natured trouble. The issue remains: when will we ever stop burdening the taxpayer with conferences devoted to the quantum foundations? The suspicion is expressed that no end will be in sight until a means is found to reduce quantum theory to two or three statements of crisp physical (rather than abstract, axiomatic) significance. In this regard, no tool appears better calibrated for a direct assault than quantum information theory. Far from a strained application of the latest fad to a time-honored problem, this method holds promise precisely because a large part, but not all, of the structure of quantum theory has always concerned information. It is just that the physics community needs reminding. This paper, though taking quant-ph/0106166 as its core, corrects one mistake and offers several observations beyond the previous version. In particular, I identify one element of quantum mechanics that I would not label a subjective term in the theory: it is the integer parameter D traditionally ascribed to a quantum system via its Hilbert-space dimension.
On matrixvalued Herglotz functions
 Math. Nachr
, 2000
Cited by 47 (23 self)
We provide a comprehensive analysis of matrix-valued Herglotz functions and illustrate their applications in the spectral theory of self-adjoint Hamiltonian systems, including matrix-valued Schrödinger and Dirac-type operators. Special emphasis is devoted to appropriate matrix-valued extensions of the well-known Aronszajn-Donoghue theory concerning support properties of measures in their Nevanlinna-Riesz-Herglotz representation. In particular, we study a class of linear fractional transformations M_A(z) of a given n × n Herglotz matrix M(z) and prove that the minimal support of the absolutely continuous part of the measure associated to M_A(z) is invariant under these linear fractional transformations. Additional applications discussed in detail include self-adjoint finite-rank perturbations of self-adjoint operators, self-adjoint extensions of densely defined symmetric linear operators (especially Friedrichs and Krein extensions), model operators for these two cases, and associated realization theorems for certain classes of Herglotz matrices.
Spectral Partitioning with Indefinite Kernels Using the Nyström Extension
 European Conference on Computer Vision 2002
, 2002
Lower Bounds for Quantum Communication Complexity
 In Proceedings of 42nd IEEE FOCS
, 2001
Cited by 44 (4 self)
We prove new lower bounds for bounded error quantum communication complexity. Our methods are based on the Fourier transform of the considered functions. First we generalize a method for proving classical communication complexity lower bounds developed by Raz [34] to the quantum case. Applying this method we give an exponential separation between bounded error quantum communication complexity and nondeterministic quantum communication complexity. We develop several other lower bound methods based on the Fourier transform, notably showing that √(s̄(f))/log n, for the average sensitivity s̄(f) of a function f, yields a lower bound on the bounded error quantum communication complexity of f(x ∧ y ⊕ z), where x is a Boolean word held by Alice and y, z are Boolean words held by Bob. We then prove the first large lower bounds on the bounded error quantum communication complexity of functions for which a polynomial quantum speedup is possible. For all the functions we investigate, the only previously applied general lower bound method, based on discrepancy, yields bounds that are O(log n).
Spectral clustering for multitype relational data
 In ICML
, 2006
Cited by 43 (5 self)
Clustering on multi-type relational data has attracted more and more attention in recent years due to its high impact on various important applications, such as Web mining, e-commerce and bioinformatics. However, the research on general multi-type relational data clustering is still limited and preliminary. The contribution of this paper is threefold. First, we propose a general model, the collective factorization on related matrices, for multi-type relational data clustering. The model is applicable to relational data with various structures. Second, under this model, we derive a novel algorithm, the spectral relational clustering, to cluster multi-type interrelated data objects simultaneously. The algorithm iteratively embeds each type of data objects into low-dimensional spaces and benefits from the interactions among the hidden structures of different types of data objects. Extensive experiments demonstrate the promise and effectiveness of the proposed algorithm. Third, we show that the existing spectral clustering algorithms can be considered special cases of the proposed model and algorithm. This demonstrates the theoretical generality of the proposed model and algorithm.