Results 1-10 of 76
Consistency of spectral clustering
2004
Cited by 286 (15 self)
Consistency is a key property of statistical algorithms when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about the consistency of most clustering algorithms. In this paper we investigate the consistency of a popular family of spectral clustering algorithms, which cluster the data with the help of eigenvectors of graph Laplacian matrices. We show that one of the two major classes of spectral clustering (normalized clustering) converges under some very general conditions, while the other (unnormalized) is only consistent under strong additional assumptions, which, as we demonstrate, are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering in practical applications. We believe that the methods used in our analysis will provide a basis for future exploration of Laplacian-based methods in a statistical setting.
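To make the two families in the abstract above concrete, here is a minimal sketch that builds an unnormalized Laplacian L = D - W and a symmetrically normalized Laplacian I - D^(-1/2) W D^(-1/2) for a small weighted graph. The toy weight matrix, the function names, and the choice of the symmetric normalization are assumptions for illustration only; the abstract does not say which normalization the paper analyzes.

```python
import math

def unnormalized_laplacian(W):
    """Return L = D - W for a symmetric weight matrix W (list of lists)."""
    n = len(W)
    degrees = [sum(row) for row in W]
    return [[(degrees[i] if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

def normalized_laplacian(W):
    """Return I - D^(-1/2) W D^(-1/2); assumes every vertex has positive degree."""
    n = len(W)
    degrees = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) for d in degrees]
    return [[(1.0 if i == j else 0.0) - inv_sqrt[i] * W[i][j] * inv_sqrt[j]
             for j in range(n)]
            for i in range(n)]

# Hypothetical similarity graph: two tight pairs joined by one weak edge.
W = [[0.0, 1.0, 0.1, 0.0],
     [1.0, 0.0, 0.0, 0.0],
     [0.1, 0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 0.0]]

L = unnormalized_laplacian(W)
L_sym = normalized_laplacian(W)
```

Spectral clustering would then use the eigenvectors of one of these matrices that belong to the smallest eigenvalues; that eigen-decomposition step is left out of this sketch.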
Locality Preserving Projections
2002
Cited by 209 (15 self)
Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the data set. LPP should be seen as an alternative to Principal Component Analysis (PCA), a classical linear technique that projects the data along the directions of maximal variance. When the high-dimensional data lies on a low-dimensional manifold embedded in the ambient space, the Locality Preserving Projections are obtained by finding the optimal linear approximations to the eigenfunctions of the Laplace-Beltrami operator on the manifold. As a result, LPP shares many of the data representation properties of nonlinear techniques such as Laplacian Eigenmaps or Locally Linear Embedding. Yet LPP is linear and, more crucially, is defined everywhere in ambient space rather than just on the training data points. This is borne out by illustrative examples on some high-dimensional data sets.
The effect of network topology on the spread of epidemics
In IEEE INFOCOM, 2005
Cited by 115 (8 self)
Many network phenomena are well modeled as spreads of epidemics through a network. Prominent examples include the spread of worms and email viruses, and, more generally, faults. Many types of information dissemination can also be modeled as spreads of epidemics. In this paper we address the question of what makes an epidemic either weak or potent. More precisely, we identify topological properties of the graph that determine the persistence of epidemics. In particular, we show that if the ratio of cure to infection rates is larger than the spectral radius of the graph, then the mean epidemic lifetime is of order log n, where n is the number of nodes. Conversely, if this ratio is smaller than a generalization of the isoperimetric constant of the graph, then the mean epidemic lifetime is of order exp(n^α), for a positive constant α. We apply these results to several network topologies including the hypercube, which is a representative connectivity graph for a distributed hash table, the complete graph, which is an important connectivity graph for BGP, and the power law graph, of which the AS-level Internet graph is a prime example. We also study the star topology and the Erdős-Rényi graph, as their epidemic spreading behaviors determine the spreading behavior of power law graphs.
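The threshold quantity named in the abstract above, the spectral radius of the graph, can be estimated with a few lines of power iteration. The sketch below is an illustration only: the function names, the example graphs, and the rates are assumptions, and the helper merely reports on which side of the spectral radius a given cure-to-infection ratio falls. It assumes a connected, undirected graph given as a 0/1 adjacency matrix.

```python
import math

def spectral_radius(adj, iterations=500):
    """Estimate the largest adjacency eigenvalue of a connected undirected graph.

    Power iteration runs on A + I, which has the same eigenvectors as A but
    avoids the oscillation that bipartite graphs (such as the star) would cause.
    """
    n = len(adj)
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    # Rayleigh quotient of A at the converged unit vector.
    return sum(x[i] * adj[i][j] * x[j] for i in range(n) for j in range(n))

def complete_graph(n):
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]

def star_graph(n):
    """Node 0 is the hub; nodes 1..n-1 are leaves."""
    return [[1 if (i == 0) != (j == 0) else 0 for j in range(n)] for i in range(n)]

def ratio_minus_radius(adj, cure_rate, infection_rate):
    """Positive when cure/infection exceeds the spectral radius, negative otherwise."""
    return cure_rate / infection_rate - spectral_radius(adj)

rho_complete = spectral_radius(complete_graph(4))  # exact value is n - 1 = 3
rho_star = spectral_radius(star_graph(5))          # exact value is sqrt(n - 1) = 2
```

The complete graph and the star are used here because their spectral radii are known in closed form, which makes the estimate easy to check.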
Distributed average consensus with least-mean-square deviation
Journal of Parallel and Distributed Computing, 2005
Cited by 82 (5 self)
We consider a stochastic model for distributed average consensus, which arises in applications such as load balancing for parallel processors, distributed coordination of mobile autonomous agents, and network synchronization. In this model, each node updates its local variable with a weighted average of its neighbors' values, and each new value is corrupted by an additive noise with zero mean. The quality of consensus can be measured by the total mean-square deviation of the individual variables from their average, which converges to a steady-state value. We consider the problem of finding the (symmetric) edge weights that result in the least mean-square deviation in steady state. We show that this problem can be cast as a convex optimization problem, so the global solution can be found efficiently. We describe some computational methods for solving this problem, and compare the weights and the mean-square deviations obtained by this method and several other weight design methods.
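The update rule described in the abstract above can be simulated directly. The following sketch is an illustration under stated assumptions: the path graph, the constant edge weight of 0.3, the Gaussian noise model, and all function names are hypothetical, and the weights are fixed by hand rather than obtained from the convex optimization problem the paper solves.

```python
import random

def consensus_step(x, edges, weights, noise_std, rng):
    """One update: each node moves toward its neighbours, then noise is added."""
    new_x = list(x)
    for (i, j), w in zip(edges, weights):
        # The same weight w is applied in both directions (symmetric weights).
        new_x[i] += w * (x[j] - x[i])
        new_x[j] += w * (x[i] - x[j])
    return [v + rng.gauss(0.0, noise_std) for v in new_x]

def mean_square_deviation(x):
    """Total squared deviation of the node values from their average."""
    avg = sum(x) / len(x)
    return sum((v - avg) ** 2 for v in x)

def simulate(x0, edges, weights, noise_std, steps, seed=0):
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        x = consensus_step(x, edges, weights, noise_std, rng)
    return x

# Hypothetical 4-node path graph with a constant weight on every edge.
edges = [(0, 1), (1, 2), (2, 3)]
weights = [0.3, 0.3, 0.3]
x0 = [1.0, 2.0, 3.0, 10.0]

noise_free = simulate(x0, edges, weights, noise_std=0.0, steps=200)
noisy = simulate(x0, edges, weights, noise_std=0.1, steps=200)
```

Without noise the values settle on the initial average (4.0 for this example); with noise they keep fluctuating around a common value, and `mean_square_deviation` measures how far apart they remain.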
Graph-Driven Features Extraction From Microarray Data
2002
Cited by 43 (3 self)
Gene function prediction from microarray data is a first step toward better understanding the machinery of the cell from relatively cheap and easy-to-produce data. In this paper we investigate whether the knowledge of many metabolic pathways and their catalyzing enzymes accumulated over the years can help improve the performance of classifiers for this problem.
Graph edit distance from spectral seriation
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005
Cited by 34 (6 self)
This paper is concerned with computing graph edit distance. One of the criticisms that can be leveled at existing methods for computing graph edit distance is that they lack some of the formality and rigor of the computation of string edit distance. Hence, our aim is to convert graphs to string sequences so that string matching techniques can be used. To do this, we use a graph spectral seriation method to convert the adjacency matrix into a string or sequence order. We show how the serial ordering can be established using the leading eigenvector of the graph adjacency matrix. We pose the problem of graph matching as a maximum a posteriori probability (MAP) alignment of the seriation sequences for pairs of graphs. This treatment leads to an expression in which the edit cost is the negative logarithm of the a posteriori sequence alignment probability. We compute the edit distance by finding the sequence of string edit operations which minimizes the cost of the path traversing the edit lattice. The edit costs are determined by the components of the leading eigenvectors of the adjacency matrix and by the edge densities of the graphs being matched. We demonstrate the utility of the edit distance on a number of graph clustering problems. Index Terms: Graph edit distance, graph seriation, maximum a posteriori probability (MAP), graph-spectral methods.
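The general idea in the abstract above, turning a graph into a sequence via the leading eigenvector and then comparing sequences with string edit operations, can be sketched in a much simplified form. Everything here is an assumption for illustration: nodes are simply sorted by decreasing eigenvector component, each node is represented by its degree, and a plain unit-cost Levenshtein distance stands in for the MAP alignment and probabilistic edit costs used in the paper.

```python
import math

def leading_eigenvector(adj, iterations=500):
    """Power iteration on A + I for a connected undirected graph."""
    n = len(adj)
    x = [1.0] * n
    for _ in range(iterations):
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x

def seriation_order(adj):
    """Order nodes by decreasing eigenvector component (ties broken by index)."""
    v = leading_eigenvector(adj)
    return sorted(range(len(adj)), key=lambda i: (-round(v[i], 9), i))

def degree_string(adj):
    """Sequence of node degrees read off in seriation order."""
    return [sum(adj[i]) for i in seriation_order(adj)]

def edit_distance(a, b):
    """Unit-cost Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]

# Two hypothetical 4-node graphs: a path and a star.
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]

distance = edit_distance(degree_string(path), degree_string(star))
```

Sorting by eigenvector component is the crudest possible serial ordering; it ignores edge connectivity between consecutive nodes, so it should be read as a stand-in rather than as the seriation procedure of the paper.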
Visualization of Bibliographic Networks with a Reshaped Landscape Metaphor
Proc. 4th Joint Eurographics - IEEE TVCG Symp. Visualization (VisSym '02), 2002
Cited by 26 (5 self)
We describe a novel approach to visualize bibliographic networks that facilitates the simultaneous identification of clusters (e.g., topic areas) and prominent entities (e.g., surveys or landmark papers). While employing the landscape metaphor proposed in several earlier works, we introduce new means to determine relevant parameters of the landscape. Moreover, we are able to compute prominent entities, clustering of entities, and the landscape's surface in a surprisingly simple and uniform way. The effectiveness of our network visualizations is illustrated on data from the graph drawing literature.
Harmonic analysis on metrized graphs
Canad. J. Math
Cited by 24 (4 self)
This paper studies the Laplacian operator on a metrized graph, and its spectral theory.
Visual Ranking of Link Structures
Journal of Graph Algorithms and Applications, 2003
Cited by 21 (4 self)
Methods for ranking World Wide Web resources according to their position in the link structure of the Web are receiving considerable attention, because they provide the first effective means for search engines to cope with the explosive growth and diversification of the Web. Closely related methods have been used in other disciplines for quite some time.
Discrete Nodal Domain Theorems
2000
Cited by 21 (7 self)
We give a detailed proof for two discrete analogues of Courant's Nodal Domain Theorem.
13 September 2000
1 Introduction
Courant's famous Nodal Domain Theorem for elliptic operators on Riemannian manifolds (see e.g. [1]) states: if f_k is an eigenfunction belonging to the k-th eigenvalue (written in increasing order and counting multiplicities) of an elliptic operator, then f_k has at most k nodal domains. When considering the analogous problem for graphs, M. Fiedler [4, 5] noticed that the second Laplacian eigenvalue is closely related to connectivity properties of the graph, and showed that f_2 always has exactly two nodal domains. It is interesting to note that his approach can be extended to show that f_k has no more than 2(k - 1) nodal domains, k ≥ 2 [7]. Various discrete versions of the Nodal Domain Theorem have been discussed in the literature [2, 6, 8, 3], however sometimes with ambiguous statements and incomplete or flawed proofs. The purpose of this contribution is not to esta...
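For readers unfamiliar with nodal domains on graphs, the sketch below counts them for a vector defined on the vertices: a nodal domain is taken here to be a maximal connected set of vertices on which the vector is strictly positive or strictly negative. The function names, the tolerance used to detect zeros, and the decision to leave zero entries out of every domain are assumptions for illustration. The example uses the path graph, whose Laplacian eigenvectors are known cosine vectors.

```python
import math

def count_nodal_domains(neighbors, f, tol=1e-12):
    """Count maximal connected vertex sets on which f keeps one strict sign."""
    n = len(f)
    sign = [0 if abs(v) <= tol else (1 if v > 0 else -1) for v in f]
    seen = [False] * n
    domains = 0
    for start in range(n):
        if seen[start] or sign[start] == 0:
            continue
        domains += 1
        seen[start] = True
        stack = [start]
        while stack:  # depth-first search restricted to same-sign vertices
            u = stack.pop()
            for v in neighbors[u]:
                if not seen[v] and sign[v] == sign[start]:
                    seen[v] = True
                    stack.append(v)
    return domains

def path_graph(n):
    """Adjacency lists of the path 0 - 1 - ... - (n-1)."""
    return [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]

def path_eigenvector(n, k):
    """k-th Laplacian eigenvector of the path on n vertices (k = 1 is constant)."""
    return [math.cos(math.pi * (k - 1) * (i + 0.5) / n) for i in range(n)]

n = 4
counts = [count_nodal_domains(path_graph(n), path_eigenvector(n, k))
          for k in range(1, n + 1)]
```

On this 4-vertex path the k-th eigenvector has k nodal domains, so the counts respect both the bound of k and the bound of 2(k - 1) for k ≥ 2 quoted in the text.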