Results 1  10
of
39
Unsupervised Learning of Image Manifolds by Semidefinite Programming
, 2004
"... Can we detect low dimensional structure in high dimensional data sets of images and video? The problem of dimensionality reduction arises often in computer vision and pattern recognition. In this paper, we propose a new solution to this problem based on semidefinite programming. Our algorithm can be ..."
Abstract

Cited by 168 (9 self)
 Add to MetaCart
Can we detect low dimensional structure in high dimensional data sets of images and video? The problem of dimensionality reduction arises often in computer vision and pattern recognition. In this paper, we propose a new solution to this problem based on semidefinite programming. Our algorithm can be used to analyze high dimensional data that lies on or near a low dimensional manifold. It overcomes certain limitations of previous work in manifold learning, such as Isomap and locally linear embedding. We illustrate the algorithm on easily visualized examples of curves and surfaces, as well as on actual images of faces, handwritten digits, and solid objects.
Enhancing Sparsity by Reweighted ℓ1 Minimization
, 2007
"... It is now well understood that (1) it is possible to reconstruct sparse signals exactly from what appear to be highly incomplete sets of linear measurements and (2) that this can be done by constrained ℓ1 minimization. In this paper, we study a novel method for sparse signal recovery that in many si ..."
Abstract

Cited by 76 (4 self)
 Add to MetaCart
It is now well understood that (1) it is possible to reconstruct sparse signals exactly from what appear to be highly incomplete sets of linear measurements and (2) that this can be done by constrained ℓ1 minimization. In this paper, we study a novel method for sparse signal recovery that in many situations outperforms ℓ1 minimization in the sense that substantially fewer measurements are needed for exact recovery. The algorithm consists of solving a sequence of weighted ℓ1minimization problems where the weights used for the next iteration are computed from the value of the current solution. We present a series of experiments demonstrating the remarkable performance and broad applicability of this algorithm in the areas of sparse signal recovery, statistical estimation, error correction and image processing. Interestingly, superior gains are also achieved when our method is applied to recover signals with assumed nearsparsity in overcomplete representations—not by reweighting the ℓ1 norm of the coefficient sequence as is common, but by reweighting the ℓ1 norm of the transformed object. An immediate consequence is the possibility of highly efficient data acquisition protocols by improving on a technique known as compressed sensing.
Nonlinear dimensionality reduction by semidefinite programming and kernel matrix factorization
 in Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics
, 2005
"... We describe an algorithm for nonlinear dimensionality reduction based on semidefinite programming and kernel matrix factorization. The algorithm learns a kernel matrix for high dimensional data that lies on or near a low dimensional manifold. In earlier work, the kernel matrix was learned by maximiz ..."
Abstract

Cited by 48 (5 self)
 Add to MetaCart
We describe an algorithm for nonlinear dimensionality reduction based on semidefinite programming and kernel matrix factorization. The algorithm learns a kernel matrix for high dimensional data that lies on or near a low dimensional manifold. In earlier work, the kernel matrix was learned by maximizing the variance in feature space while preserving the distances and angles between nearest neighbors. In this paper, adapting recent ideas from semisupervised learning on graphs, we show that the full kernel matrix can be very well approximated by a product of smaller matrices. Representing the kernel matrix in this way, we can reformulate the semidefinite program in terms of a much smaller submatrix of inner products between randomly chosen landmarks. The new framework leads to orderofmagnitude reductions in computation time and makes it possible to study much larger problems in manifold learning. 1
Enhacing sparsity by reweighted ℓ1 minimization
 Journal of Fourier Analysis and Applications
, 2008
"... It is now well understood that (1) it is possible to reconstruct sparse signals exactly from what appear to be highly incomplete sets of linear measurements and (2) that this can be done by constrained ℓ1 minimization. In this paper, we study a novel method for sparse signal recovery that in many si ..."
Abstract

Cited by 34 (1 self)
 Add to MetaCart
It is now well understood that (1) it is possible to reconstruct sparse signals exactly from what appear to be highly incomplete sets of linear measurements and (2) that this can be done by constrained ℓ1 minimization. In this paper, we study a novel method for sparse signal recovery that in many situations outperforms ℓ1 minimization in the sense that substantially fewer measurements are needed for exact recovery. The algorithm consists of solving a sequence of weighted ℓ1minimization problems where the weights used for the next iteration are computed from the value of the current solution. We present a series of experiments demonstrating the remarkable performance and broad applicability of this algorithm in the areas of sparse signal recovery, statistical estimation, error correction and image processing. Interestingly, superior gains are also achieved when our method is applied to recover signals with assumed nearsparsity in overcomplete representations—not by reweighting the ℓ1 norm of the coefficient sequence as is common, but by reweighting the ℓ1 norm of the transformed object. An immediate consequence is the possibility of highly efficient data acquisition protocols by improving on a technique known as compressed sensing.
An introduction to nonlinear dimensionality reduction by maximum variance unfolding
 Unfolding, Proceedings of the 21st National Conference on Artificial Intelligence
, 2006
"... ..."
Colored maximum variance unfolding
, 2007
"... Maximum variance unfolding (MVU) is an effective heuristic for dimensionality reduction. It produces a lowdimensional representation of the data by maximizing the variance of their embeddings while preserving the local distances of the original data. We show that MVU also optimizes a statistical de ..."
Abstract

Cited by 25 (2 self)
 Add to MetaCart
Maximum variance unfolding (MVU) is an effective heuristic for dimensionality reduction. It produces a lowdimensional representation of the data by maximizing the variance of their embeddings while preserving the local distances of the original data. We show that MVU also optimizes a statistical dependence measure which aims to retain the identity of individual observations under the distancepreserving constraints. This general view allows us to design “colored” variants of MVU, which produce lowdimensional representations for a given task, e.g. subject to class labels or other side information.
Graph Laplacian Regularization for LargeScale . . .
, 2006
"... In many areas of science and engineering, the problem arises how to discover low dimensional representations of high dimensional data. Recently, a number of researchers have converged on common solutions to this problem using methods from convex optimization. In particular, many results have been ob ..."
Abstract

Cited by 24 (2 self)
 Add to MetaCart
In many areas of science and engineering, the problem arises how to discover low dimensional representations of high dimensional data. Recently, a number of researchers have converged on common solutions to this problem using methods from convex optimization. In particular, many results have been obtained by constructing semidefinite programs (SDPs) with low rank solutions. While the rank of matrix variables in SDPs cannot be directly constrained, it has been observed that low rank solutions emerge naturally by computing high variance or maximal trace solutions that respect local distance constraints. In this paper, we show how to solve very large problems of this type by a matrix factorization that leads to much smaller SDPs than those previously studied. The matrix factorization is derived by expanding the solution of the original problem in terms of the bottom eigenvectors of a graph Laplacian. The smaller SDPs obtained from this matrix factorization yield very good approximations to solutions of the original problem. Moreover, these approximations can be further refined by conjugate gradient descent. We illustrate the approach on localization in large scale sensor networks, where optimizations involving tens of thousands of nodes can be solved in just a few minutes.
Convex optimization of graph Laplacian eigenvalues
 in International Congress of Mathematicians
"... Abstract. We consider the problem of choosing the edge weights of an undirected graph so as to maximize or minimize some function of the eigenvalues of the associated Laplacian matrix, subject to some constraints on the weights, such as nonnegativity, or a given total value. In many interesting case ..."
Abstract

Cited by 23 (0 self)
 Add to MetaCart
Abstract. We consider the problem of choosing the edge weights of an undirected graph so as to maximize or minimize some function of the eigenvalues of the associated Laplacian matrix, subject to some constraints on the weights, such as nonnegativity, or a given total value. In many interesting cases this problem is convex, i.e., it involves minimizing a convex function (or maximizing a concave function) over a convex set. This allows us to give simple necessary and sufficient optimality conditions, derive interesting dual problems, find analytical solutions in some cases, and efficiently compute numerical solutions in all cases. In this overview we briefly describe some more specific cases of this general problem, which have been addressed in a series of recent papers. • Fastest mixing Markov chain. Find edge transition probabilities that give the fastest mixing (symmetric, discretetime) Markov chain on the graph. • Fastest mixing Markov process. Find the edge transition rates that give the fastest mixing (symmetric, continuoustime) Markov process on the graph. • Absolute algebraic connectivity. Find edge weights that maximize the algebraic
Mathematical aspects of mixing times in markov chains
 FOUND. TRENDS THEOR. COMPUT. SCI
, 2006
"... ..."
Minimizing effective resistance of a graph
 SIAM Review
, 2005
"... Abstract. The effective resistance between two nodes of a weighted graph is the electrical resistance seen between the nodes of a resistor network with branch conductances given by the edge weights. The effective resistance comes up in many applications and fields in addition to electrical network a ..."
Abstract

Cited by 15 (4 self)
 Add to MetaCart
Abstract. The effective resistance between two nodes of a weighted graph is the electrical resistance seen between the nodes of a resistor network with branch conductances given by the edge weights. The effective resistance comes up in many applications and fields in addition to electrical network analysis, including, for example, Markov chains and continuoustime averaging networks. In this paper we study the problem of allocating edge weights on a given graph in order to minimize the total effective resistance, i.e., the sum of the resistances between all pairs of nodes. We show that this is a convex optimization problem, and can be solved efficiently either numerically, or, in some cases, analytically. We show that optimal allocation of the edge weights can reduce the total effective resistance of the graph (compared to uniform weights) by a factor that grows unboundedly with the size of the graph. We show that among all graphs with n nodes, the path has the largest value of optimal total effective resistance, and the complete graph the least. 1. Introduction. Let N be a network with n nodes and m edges, i.e., an undirected graph (V, E) with V  = n, E  = m, and nonnegative weights on the edges. We call the weight on edge l its conductance, and denote it by gl. The effective resistance between a pair of nodes i and j, denoted Rij, is the electrical resistance measured across nodes i and j, when the network represents an electrical circuit with each edge (or branch, in the terminology of electrical circuits) a resistor with (electrical) conductance gl. In other