Results 1 - 10 of 48
On Clusterings: Good, Bad and Spectral
2000
Cited by 203 (10 self)
We motivate and develop a natural bicriteria measure for assessing the quality of a clustering which avoids the drawbacks of existing measures. A simple recursive heuristic has poly-logarithmic worst-case guarantees under the new measure. The main result of the paper is the analysis of a popular spectral algorithm. One variant of spectral clustering turns out to have effective worst-case guarantees
Consistency of spectral clustering
2004
Cited by 170 (11 self)
Consistency is a key property of statistical algorithms when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spectral clustering algorithms, which cluster the data with the help of eigenvectors of graph Laplacian matrices. We show that one of the two major classes of spectral clustering (normalized clustering) converges under some very general conditions, while the other (unnormalized) is only consistent under strong additional assumptions, which, as we demonstrate, are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering in practical applications. We believe that methods used in our analysis will provide a basis for future exploration of Laplacian-based methods in a statistical setting.
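As a rough illustration of the normalized variant mentioned in this abstract, the sketch below splits a graph in two using the second eigenvector of the normalized Laplacian L_sym = I - D^{-1/2} W D^{-1/2}. It is not taken from the paper: the function name, the dense list-of-lists input, and the use of deflated power iteration (chosen only to keep the example free of external libraries) are all assumptions made for this example.

```python
import math

def normalized_fiedler_split(adj, iters=500):
    """Two-way split from the second eigenvector of the normalized Laplacian.

    adj: symmetric list-of-lists of nonnegative weights (hypothetical input
    format); the graph is assumed connected with no isolated nodes.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    # M = 2I - L_sym = I + D^{-1/2} W D^{-1/2}; its eigenvalues lie in [0, 2],
    # so the largest eigenvalues of M are the smallest ones of L_sym.
    m = [[(1.0 if i == j else 0.0) + adj[i][j] / math.sqrt(deg[i] * deg[j])
          for j in range(n)] for i in range(n)]
    # Trivial eigenvector of L_sym (eigenvalue 0): D^{1/2} 1, normalized.
    norm0 = math.sqrt(sum(deg))
    u0 = [math.sqrt(d) / norm0 for d in deg]
    x = [float(i + 1) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        # Deflate: remove the component along the trivial eigenvector.
        dot = sum(a * b for a, b in zip(x, u0))
        x = [a - dot * b for a, b in zip(x, u0)]
        # One power-iteration step with M, then renormalize.
        x = [sum(m[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in x))
        x = [a / norm for a in x]
    # Map back to the random-walk coordinates D^{-1/2} x and split by sign.
    f = [x[i] / math.sqrt(deg[i]) for i in range(n)]
    return [0 if v < 0 else 1 for v in f]
```

On a graph made of two triangles joined by a single edge, the sign split should place each triangle in its own cluster. A practical implementation would more likely call a library eigensolver than run power iteration by hand.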
A spectral clustering approach to finding communities in graphs
In SIAM International Conference on Data Mining, 2005
Cited by 60 (0 self)
Clustering nodes in a graph is a useful general technique in data mining of large network data sets. In this context, Newman and Girvan [9] recently proposed an objective function for graph clustering called the Q function which allows automatic selection of the number of clusters. Empirically, higher values of the Q function have been shown to correlate well with good graph clusterings. In this paper we show how optimizing the Q function can be reformulated as a spectral relaxation problem and propose two new spectral clustering algorithms that seek to maximize Q. Experimental results indicate that the new algorithms are efficient and effective at finding both good clusterings and the appropriate number of clusters across a variety of real-world graph data sets. In addition, the spectral algorithms are much faster for large sparse graphs, scaling roughly linearly with the number of nodes n in the graph, compared to O(n^2) for previous clustering algorithms using the Q function.
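For readers unfamiliar with the Q function referred to above, the following minimal sketch evaluates the commonly stated Newman-Girvan modularity for a fixed partition of an undirected, unweighted graph: for each community, the fraction of edges inside it minus the squared fraction of edge endpoints attached to it. It only scores a given clustering and does not implement the spectral relaxation proposed in the paper; the function name and input format are assumptions for this example.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman-Girvan Q for an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community: dict mapping node -> community label.
    """
    m = 0
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # total degree of the nodes in community c
    for u, v in edges:
        m += 1
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    if m == 0:
        return 0.0
    # Q = sum over communities of (internal edge fraction - expected fraction).
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)
```

Putting every node into one community always gives Q = 0, which is why Q can be used to compare clusterings with different numbers of clusters.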
VLSI Circuit Partitioning by Cluster-Removal using Iterative Improvement Techniques
Proc. IEEE International Conference on Computer-Aided Design, 1996
Cited by 50 (6 self)
Move-based iterative improvement partitioning methods such as the Fiduccia-Mattheyses (FM) algorithm [3] and Krishnamurthy's Look-Ahead (LA) algorithm [4] are widely used in VLSI CAD applications largely due to their time efficiency and ease of implementation. This class of algorithms is of the "local improvement" type. They generate relatively high quality results for small and medium size circuits. However, as VLSI circuits become larger, these algorithms are not so effective on them as direct partitioning tools. We propose new iterative-improvement methods that select cells to move with a view to moving clusters that straddle the two subsets of a partition into one of the subsets. The new algorithms significantly improve partition quality while preserving the advantage of time efficiency. Experimental results on 25 medium to large size ACM/SIGDA benchmark circuits show up to 70% improvement over FM in cutsize, with an average of per-circuit percent improvements of about 25%, and a t...
A Unifying Theorem for Spectral Embedding and Clustering
2003
Cited by 45 (0 self)
Spectral methods use selected eigenvectors of a data affinity matrix to obtain a data representation that can be trivially clustered or embedded in a low-dimensional space. We present a theorem that explains, for broad classes of affinity matrices and eigenbases, why this works: For successively smaller eigenbases (i.e., using fewer and fewer of the affinity matrix's dominant eigenvalues and eigenvectors), the angles between "similar" vectors in the new representation shrink while the angles between "dissimilar" vectors grow. Specifically, the sum of the squared cosines of the angles is strictly increasing as the dimensionality of the representation decreases. Thus spectral methods work because the truncated eigenbasis amplifies structure in the data so that any heuristic post-processing is more likely to succeed. We use this result to construct a nonlinear dimensionality reduction (NLDR) algorithm for data sampled from manifolds whose intrinsic coordinate system has linear and cyclic axes, and a novel clustering-by-projections algorithm that requires no post-processing and gives superior performance on "challenge problems" from the recent literature.
A hybrid multilevel/genetic approach for circuit partitioning
In Proc. ACM/SIGDA Physical Design Workshop, 1996
Cited by 41 (7 self)
We present a genetic circuit partitioning algorithm that integrates the Metis graph partitioning package [15] originally designed for sparse matrix computations. Metis is an extremely fast iterative partitioner that uses multilevel clustering. We have adapted Metis to partition circuit netlists, and have applied a genetic technique that uses previous Metis solutions to help construct new Metis solutions. Our hybrid technique produces better results than Metis alone, and also produces bipartitionings that are competitive with previous methods [20] [18] [6] while using less CPU time.
Probability-Based Approaches to VLSI Circuit Partitioning
2000
Cited by 38 (7 self)
Iterative-improvement 2-way min-cut partitioning is an important phase in most circuit placement tools, and finds use in many other CAD applications. Most iterative improvement techniques for circuit netlists like the Fiduccia-Mattheyses (FM) method compute the gains of nodes using local netlist information that is only concerned with the immediate improvement in the cutset. This can lead to misleading gain information. Krishnamurthy suggested a lookahead (LA) gain calculation method to ameliorate this situation; however, as we show, it leaves room for improvement. We present here a probabilistic gain computation approach called PROP (PRObabilistic Partitioner) that is capable of capturing the future implications of moving a node at the current time. We also propose an extended algorithm SHRINK-PROP that increases the probability of removing recently "perturbed" nets (nets whose nodes have been moved for the first time) from the cutset. This is necessary, since in a regular move process, the removal probabilities of most nets either remain unchanged or even decrease when their nodes are moved for the first time. Experimental results on medium- to large-size ACM/SIGDA benchmark circuits show that PROP and SHRINK-PROP outperform previous iterative-improvement methods like FM (by about 30% and 37%, respectively) and LA (by about 27% and 34%, respectively). Both PROP and SHRINK-PROP also obtain much better cutsizes than many recent state-of-the-art partitioners like EIG1, WINDOW, MELO, PARABOLI, GFM and GMetis (by 4.5% to 67%). We also show that the space and time complexities of PROP and SHRINK-PROP are very reasonable. Our empirical timing results reveal that PROP is appreciably faster than all recent techniques except GMetis; all other partitioners including ours work on...
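To make the notion of a purely local gain concrete, here is a minimal sketch of the immediate-improvement gain that the abstract attributes to FM-style methods: the number of nets that leave the cutset if a node is moved, minus the number that enter it. This is the baseline being criticized, not PROP or SHRINK-PROP, and the netlist representation (nets as tuples of node names, a dict giving each node's side) is an assumption for this example.

```python
def cut_size(nets, side):
    """Number of nets with nodes on both sides of the bipartition."""
    return sum(1 for net in nets if len({side[v] for v in net}) > 1)

def fm_gain(node, nets, side):
    """Immediate reduction in cut size if `node` moves to the other side.

    nets: list of tuples of node names (hypothetical netlist format).
    side: dict mapping node -> 0 or 1.
    """
    gain = 0
    for net in nets:
        if node not in net or len(net) < 2:
            continue  # nets not touching the node, or trivial nets, never change
        same = sum(1 for v in net if side[v] == side[node])
        other = len(net) - same
        if same == 1:
            gain += 1   # node is alone on its side: moving it uncuts the net
        elif other == 0:
            gain -= 1   # net lies entirely on node's side: moving it cuts the net
    return gain
```

The gain looks only one move ahead, which is the limitation that lookahead and probabilistic gain schemes are meant to address.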
Large Scale Circuit Partitioning with Loose/Stable Net Removal and Signal Flow Based Clustering
In Proc. Int. Conf. on Computer-Aided Design, 1997
Cited by 29 (9 self)
In this paper, we present an efficient Iterative Improvement based Partitioning (IIP) algorithm called LSR/MFFS, which combines the signal-flow-based Maximum Fanout Free Subgraph (MFFS) clustering algorithm with the Loose and Stable net Removal (LSR) partitioning algorithm. The MFFS algorithm generalizes the existing MFFC decomposition method from combinational circuits to general sequential circuits in order to handle cycles naturally. We also carefully study the properties of the nets that straddle the cutline, and introduce the concepts of loose and stable nets as well as effective ways to remove them from the cutset. The LSR/MFFS algorithm first applies the LSR algorithm to the clustered netlist generated by the MFFS algorithm for global-level cutsize optimization and then declusters the netlist for further cutsize refinement. As a result, the LSR/MFFS algorithm has achieved the best cutsize result among all the bipartitioning algorithms published in the literature, with very promising runtime performance. In particular, it outperforms the recent state-of-the-art IIP algorithms LA3-CDIP, CLIP-PROP f [8], Strawman [12], hMetis-FM [13], and MLc [1] by 17.4%, 12.1%, 5.9%, 3.1%, and 1.9%, respectively. It also outperforms the state-of-the-art non-IIP algorithms Paraboli [17], FBB [19], and PANZA [16] by 32.0%, 21.4%, and 1.4%, respectively.
Multi-Label Image Segmentation for Medical Applications Based on Graph-Theoretic Electrical Potentials
ECCV, 2004
Cited by 28 (9 self)
A novel method is proposed for performing multi-label, semi-automated image segmentation. Given a small number of pixels with user-defined labels, one can analytically (and quickly) determine the probability that a random walker starting at each unlabeled pixel will first reach one of the pre-labeled pixels. By assigning each pixel to the label for which the greatest probability is calculated, a high-quality image segmentation may be obtained. Theoretical properties of this algorithm are developed along with the corresponding connections to discrete potential theory and electrical circuits. This algorithm is formulated in discrete space (i.e., on a graph) using combinatorial analogues of standard operators and principles from continuous potential theory, allowing it to be applied in arbitrary dimension.
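The labeling rule described above can be sketched on a small weighted graph. The first-arrival probabilities are harmonic: at every unlabeled node, the probability equals the weighted average of its neighbors' probabilities, with seeds held fixed at 0 or 1. The sketch below approximates that solution by repeated in-place averaging (Gauss-Seidel sweeps) rather than the direct linear solve implied by the abstract; the function names and the nested-dict graph format are assumptions for this example.

```python
def random_walker_probabilities(weights, seeds, label, sweeps=2000, tol=1e-12):
    """Probability that a walker from each node first reaches a seed of `label`.

    weights: dict node -> dict neighbor -> positive weight (symmetric);
             every unlabeled node is assumed to have at least one neighbor.
    seeds:   dict node -> label for the pre-labeled nodes.
    """
    prob = {v: (1.0 if seeds.get(v) == label else 0.0) for v in weights}
    free = [v for v in weights if v not in seeds]
    for _ in range(sweeps):
        change = 0.0
        for v in free:
            total = sum(weights[v].values())
            # Harmonic condition: weighted average of neighboring values.
            new = sum(w * prob[u] for u, w in weights[v].items()) / total
            change = max(change, abs(new - prob[v]))
            prob[v] = new
        if change < tol:
            break
    return prob

def segment(weights, seeds):
    """Assign each node the label with the greatest first-arrival probability."""
    labels = sorted(set(seeds.values()))
    probs = {lab: random_walker_probabilities(weights, seeds, lab)
             for lab in labels}
    return {v: max(labels, key=lambda lab: probs[lab][v]) for v in weights}
```

For an image, the nodes would be pixels and the weights would typically decrease with intensity difference between neighboring pixels; that weighting choice is not specified in the abstract and is left to the caller here.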

