Consistency of spectral clustering
, 2004
Cited by 286 (15 self)
Consistency is a key property of statistical algorithms when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spectral clustering algorithms, which cluster the data with the help of eigenvectors of graph Laplacian matrices. We show that one of the two major classes of spectral clustering (normalized clustering) converges under some very general conditions, while the other (unnormalized) is consistent only under strong additional assumptions, which, as we demonstrate, are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering in practical applications. We believe that methods used in our analysis will provide a basis for future exploration of Laplacian-based methods in a statistical setting.
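To make the two classes concrete, here is a minimal sketch of the matrices usually meant by "unnormalized" and "normalized" graph Laplacians: L = D - W and L_sym = D^{-1/2} L D^{-1/2}, where W is a symmetric similarity matrix and D its diagonal degree matrix. These are the standard textbook definitions, not necessarily the exact operators analyzed in the paper. The function name `graph_laplacians` and the toy path graph are assumptions made for illustration.

```python
from math import sqrt

def graph_laplacians(W):
    """Return (L, L_sym) for a symmetric weight matrix W (list of lists).

    L     = D - W                    (unnormalized Laplacian)
    L_sym = D^{-1/2} L D^{-1/2}      (symmetric normalized Laplacian)
    Hypothetical helper for illustration only.
    """
    n = len(W)
    deg = [sum(row) for row in W]  # diagonal of the degree matrix D
    if any(d <= 0 for d in deg):
        raise ValueError("every node needs positive degree")
    L = [[(deg[i] if i == j else 0.0) - W[i][j] for j in range(n)]
         for i in range(n)]
    L_sym = [[L[i][j] / sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    return L, L_sym

# Toy example (assumed): a path graph on three nodes, 0 - 1 - 2.
W = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
L, L_sym = graph_laplacians(W)
for row in L:
    print(row)
```

Spectral clustering then works with eigenvectors of one of these matrices; the eigen-decomposition step is omitted here to keep the sketch dependency-free.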
Locality Preserving Projections
, 2002
Cited by 209 (15 self)
Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the data set. LPP should be seen as an alternative to Principal Component Analysis (PCA), a classical linear technique that projects the data along the directions of maximal variance. When the high-dimensional data lies on a low-dimensional manifold embedded in the ambient space, the Locality Preserving Projections are obtained by finding the optimal linear approximations to the eigenfunctions of the Laplace-Beltrami operator on the manifold. As a result, LPP shares many of the data representation properties of nonlinear techniques such as Laplacian Eigenmaps or Locally Linear Embedding. Yet LPP is linear and, more crucially, is defined everywhere in ambient space rather than just on the training data points. This is borne out by illustrative examples on some high-dimensional data sets.
Statistical properties of community structure in large social and information networks
Cited by 120 (10 self)
A large body of work has been devoted to identifying community structure in networks. A community is often thought of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structural properties of such sets of nodes. We define the network community profile plot, which characterizes the “best” possible community (according to the conductance measure) over a wide range of size scales, and we study over 70 large sparse real-world networks taken from a wide range of application domains. Our results suggest a significantly more refined picture of community structure in large real-world networks than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small scales, and at larger size scales, the best possible communities gradually “blend in” with the rest of the network and thus become less “community-like.” This behavior is not explained, even at a qualitative level, by any of the commonly used network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are well-embeddable in a low-dimensional structure, and from small social networks that have served as testbeds of community detection algorithms. We have found, however, that a generative model, in which new edges are added via an iterative “forest fire” burning process, is able to produce graphs exhibiting a network community structure similar to our observations.
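As a rough illustration of the conductance measure and of a community profile, the sketch below uses the common definition: the number of edges leaving a node set S, divided by the smaller of the total degree of S and the total degree of its complement. It then finds the lowest conductance at each size by exhaustive search. The exhaustive search is a stand-in that is feasible only on tiny graphs; it is not the approximation machinery used in the paper. The names `conductance` and `brute_force_profile` and the toy graph are assumptions.

```python
from itertools import combinations

def conductance(adj, S):
    """Conductance of node set S in an undirected graph.

    adj maps each node to the set of its neighbours.
    phi(S) = cut(S, complement) / min(vol(S), vol(complement)),
    where vol is the sum of node degrees.
    """
    S = set(S)
    vol_S = sum(len(adj[u]) for u in S)
    vol_rest = sum(len(adj[u]) for u in adj if u not in S)
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    denom = min(vol_S, vol_rest)
    if denom == 0:
        raise ValueError("S and its complement must both contain edges")
    return cut / denom

def brute_force_profile(adj, max_size):
    """Lowest conductance over all node sets of each size 1..max_size.

    Exponential-time illustration only (hypothetical helper).
    """
    nodes = sorted(adj)
    return {k: min(conductance(adj, c) for c in combinations(nodes, k))
            for k in range(1, max_size + 1)}

# Toy graph (assumed): two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = {i: set() for i in range(6)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

print(conductance(adj, {0, 1, 2}))
print(brute_force_profile(adj, 3))
```

On this toy graph the best set of size 3 is one of the triangles, which is cut from the rest by a single edge.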
Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters
, 2008
Cited by 79 (6 self)
A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins with the premise that a community or a cluster should be thought of as a set of nodes that has more and/or better connections between its members than to the remainder of the network. In this paper, we explore from a novel perspective several questions related to identifying meaningful communities in large social and information networks, and we come to several striking conclusions. Rather than defining a procedure to extract sets of nodes from a graph and then attempting to interpret these sets as “real” communities, we employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities. In particular, we define the network community profile plot, which characterizes the “best” possible community (according to the conductance measure) over a wide range of size scales. We study over 100 large real-world networks, ranging from traditional and online social networks, to technological and information networks and
Variational Restoration Of Nonflat Image Features: Models And Algorithms
, 2000
Cited by 77 (14 self)
We develop both mathematical models and computational algorithms for variational denoising and restoration of nonflat image features. Nonflat image features are those that live on Riemannian manifolds rather than in Euclidean spaces. Familiar examples include the orientation feature (from optical flows or gradient flows) that lives on the unit circle S¹, the alignment feature (from fingerprint waves or certain texture images) that lives on the real projective line RP¹, and the chromaticity feature (from color images) that lives on the unit sphere S². In this paper, we apply the variational method to denoise and restore general nonflat image features. Mathematical models for both continuous image domains and discrete domains (or graphs) are constructed. Riemannian objects such as metric, distance and Levi-Civita connection play important roles in the models. Computational algorithms are also developed for the resulting nonlinear equations. The mathematical framework can be applied to restoring general nonflat data outside the scope of image processing and computer vision.
Empirical comparison of algorithms for network community detection
 In Proc. WWW’10
, 2010
Cited by 61 (4 self)
Detecting clusters or communities in large real-world graphs such as large social or information networks is a problem of considerable interest. In practice, one typically chooses an objective function that captures the intuition of a network cluster as a set of nodes with better internal connectivity than external connectivity, and then one applies approximation algorithms or heuristics to extract sets of nodes that are related to the objective function and that “look like” good communities for the application of interest. In this paper, we explore a range of network community detection methods in order to compare them and to understand their relative performance and the systematic biases in the clusters they identify. We evaluate several common objective functions that are used to formalize the notion of a network community, and we examine several different classes of approximation algorithms that aim to optimize such objective functions. In addition, rather than simply fixing an objective and asking for an approximation to the best cluster of any size, we consider a size-resolved version of the optimization problem. Considering community quality as a function of its size provides a much finer lens with which to examine community detection algorithms, since objective functions and approximation algorithms often have non-obvious size-dependent behavior.
Graph Laplacians and Stabilization of Vehicle Formations
, 2001
Cited by 58 (5 self)
Control of vehicle formations has emerged as a topic of significant interest to the controls community. In this paper, we merge tools from graph theory and control theory to derive stability criteria for formation stabilization. The interconnection between vehicles (i.e., which vehicles are sensed by other vehicles) is modeled as a graph, and the eigenvalues of the Laplacian matrix of the graph are used in stating a Nyquist-like stability criterion for vehicle formations. The location of the Laplacian eigenvalues can be correlated to the graph structure, and therefore used to identify desirable and undesirable formation interconnection topologies.
A Unifying Theorem for Spectral Embedding and Clustering
, 2003
Cited by 55 (0 self)
Spectral methods use selected eigenvectors of a data affinity matrix to obtain a data representation that can be trivially clustered or embedded in a low-dimensional space. We present a theorem that explains, for broad classes of affinity matrices and eigenbases, why this works: For successively smaller eigenbases (i.e., using fewer and fewer of the affinity matrix's dominant eigenvalues and eigenvectors), the angles between "similar" vectors in the new representation shrink while the angles between "dissimilar" vectors grow. Specifically, the sum of the squared cosines of the angles is strictly increasing as the dimensionality of the representation decreases. Thus spectral methods work because the truncated eigenbasis amplifies structure in the data so that any heuristic postprocessing is more likely to succeed. We use this result to construct a nonlinear dimensionality reduction (NLDR) algorithm for data sampled from manifolds whose intrinsic coordinate system has linear and cyclic axes, and a novel clustering-by-projections algorithm that requires no postprocessing and gives superior performance on "challenge problems" from the recent literature.
Graph-Driven Features Extraction From Microarray Data
, 2002
Cited by 43 (3 self)
Gene function prediction from microarray data is a first step toward better understanding the machinery of the cell from relatively cheap and easy-to-produce data. In this paper we investigate whether the knowledge of many metabolic pathways and their catalyzing enzymes accumulated over the years can help improve the performance of classifiers for this problem.
Semi-supervised discriminant analysis
 in Proc. of the IEEE Int’l Conf. on Comp. Vision (ICCV), Rio De Janeiro
, 2007
Cited by 42 (2 self)
Linear Discriminant Analysis (LDA) has been a popular method for extracting features which preserve class separability. The projection vectors are commonly obtained by maximizing the between-class covariance and simultaneously minimizing the within-class covariance. In practice, when there are not enough training samples, the covariance matrix of each class may not be accurately estimated. In this paper, we propose a novel method, called Semi-supervised Discriminant Analysis (SDA), which makes use of both labeled and unlabeled samples. The labeled data points are used to maximize the separability between different classes and the unlabeled data points are used to estimate the intrinsic geometric structure of the data. Specifically, we aim to learn a discriminant function which is as smooth as possible on the data manifold. Experimental results on single training image face recognition and relevance feedback image retrieval demonstrate the effectiveness of our algorithm.