Normalized Cuts and Image Segmentation
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2000
Consistency of spectral clustering
, 2004
Abstract

Cited by 287 (15 self)
Consistency is a key property of statistical algorithms when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about the consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spectral clustering algorithms, which cluster the data with the help of eigenvectors of graph Laplacian matrices. We show that one of the two major classes of spectral clustering (normalized clustering) converges under some very general conditions, while the other (unnormalized) is only consistent under strong additional assumptions, which, as we demonstrate, are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering in practical applications. We believe that methods used in our analysis will provide a basis for future exploration of Laplacian-based methods in a statistical setting.
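To make the two classes concrete, here is a minimal sketch of the two graph Laplacians usually meant by "unnormalized" and "normalized". The symmetric normalization I - D^{-1/2} W D^{-1/2} and the toy similarity matrix W are assumptions for illustration; the abstract does not say which normalization or data the paper uses.

```python
import math

def degree(W):
    """Row sums of a symmetric similarity matrix W (list of lists)."""
    return [sum(row) for row in W]

def unnormalized_laplacian(W):
    """L = D - W, where D is the diagonal degree matrix."""
    d = degree(W)
    n = len(W)
    return [[(d[i] if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

def normalized_laplacian(W):
    """L_sym = I - D^{-1/2} W D^{-1/2}; assumes every degree is positive."""
    d = degree(W)
    n = len(W)
    return [[(1.0 if i == j else 0.0) - W[i][j] / math.sqrt(d[i] * d[j])
             for j in range(n)]
            for i in range(n)]

# Hypothetical toy graph: two tight pairs {0, 1} and {2, 3} joined by weak edges.
W = [
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.1, 1.0, 0.0],
]
L = unnormalized_laplacian(W)
L_sym = normalized_laplacian(W)
```

Spectral clustering then works with the eigenvectors of L or L_sym. The constant vector lies in the null space of L, and the vector of square-rooted degrees lies in the null space of L_sym, which gives a quick sanity check on either construction.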
A Min-max Cut Algorithm for Graph Partitioning and Data Clustering
, 2001
Abstract

Cited by 152 (12 self)
An important application of graph partitioning is data clustering using a graph model: the pairwise similarities between all data objects form a weighted graph adjacency matrix that contains all necessary information for clustering. Here we propose a new algorithm for graph partitioning with an objective function that follows the min-max clustering principle. The relaxed version of the optimization of the min-max cut objective function leads to the Fiedler vector in spectral graph partitioning. Theoretical analyses of min-max cut indicate that it leads to balanced partitions, and lower bounds are derived. The min-max cut algorithm is tested on newsgroup datasets and is found to outperform other currently popular partitioning/clustering methods. The linkage-based refinements in the algorithm further improve the quality of clustering substantially. We also demonstrate that the linearized search order based on linkage differential is better than that based on the Fiedler vector, providing another effective partition method.
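The abstract does not spell out the objective. The sketch below assumes the form commonly given for min-max cut, Mcut(A, B) = cut(A, B)/W(A) + cut(A, B)/W(B), where cut(A, B) is the total similarity between the two parts and W(A) is the total similarity inside A. The helper names and the toy matrix are hypothetical.

```python
def link(W, A, B):
    """Total similarity between index lists A and B."""
    return sum(W[i][j] for i in A for j in B)

def mcut(W, A, B):
    """Assumed min-max cut objective: small between-part similarity,
    large within-part similarity. Requires non-zero within-part sums."""
    cut = link(W, A, B)
    return cut / link(W, A, A) + cut / link(W, B, B)

# Hypothetical toy graph: two tight pairs {0, 1} and {2, 3} joined by weak edges.
W = [
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.1, 1.0, 0.0],
]
good = mcut(W, [0, 1], [2, 3])   # cuts only the weak edges
bad = mcut(W, [0, 2], [1, 3])    # cuts both strong edges
print(good < bad)
```

Dividing by the within-part similarity is what penalizes splitting off a tiny, weakly connected part, which is one way to read the abstract's claim about balanced partitions.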
Motion Segmentation and Tracking Using Normalized Cuts
, 1998
Abstract

Cited by 146 (5 self)
We propose a motion segmentation algorithm that aims to break a scene into its most prominent moving groups. A weighted graph is constructed on the image sequence by connecting pixels that are in the spatiotemporal neighborhood of each other. At each pixel, we define motion profile vectors which capture the probability distribution of the image velocity. The distance between motion profiles is used to assign a weight on the graph edges. Using normalized cuts we find the most salient partitions of the spatiotemporal graph formed by the image sequence. For segmenting long image sequences, we have developed a recursive update procedure that incorporates knowledge of segmentation in previous frames for efficiently finding the group correspondence in the new frame.
Bipartite graph partitioning and data clustering
 Proc. Int'l Conf. Information and Knowledge Management (CIKM)
, 2001
Cited by 90 (14 self)
A Practical Approach to Dynamic Load Balancing
, 1995
Abstract

Cited by 69 (7 self)
algorithm for load balancing. The following sections elaborate on each step in the above algorithm, presenting various design decisions that one encounters. 2.1 Load Evaluation The efficacy of any load balancing scheme is directly dependent on the quality of load evaluation. Good load measurement is necessary both to determine that a load imbalance exists and to calculate how much work should be transferred to alleviate that imbalance. One can determine the load associated with a given task analytically, empirically, or by a combination of those two methods. 2.1.1 Analytic Load Evaluation The load for a task is estimated based on knowledge of the time complexity of the algorithm(s) that task is executing along with the data structures on which it is operating. For example, if one knew that a task involved merge sorting a list of 64 elements, one might estimate the load to be 384, since merge sort is an O(N log_2 N) sorting algorithm, and since 64 log_2(64) ...
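The merge sort estimate in the excerpt can be reproduced directly. This is a minimal sketch: the function name, the second task of 16 elements, and the "move half the difference" rule are assumptions added for illustration, not part of the excerpt.

```python
import math

def merge_sort_load(n):
    """Analytic load estimate for merge sorting n elements: n * log2(n)."""
    return n * math.log2(n)

# The excerpt's example: a list of 64 elements.
load_a = merge_sort_load(64)   # 64 * 6 = 384.0

# Hypothetical second task on another processor, sorting 16 elements.
load_b = merge_sort_load(16)   # 16 * 4 = 64.0

# Work to transfer from the heavier to the lighter processor so that
# both end up at the mean load (assumes work is freely divisible).
transfer = (load_a - load_b) / 2
print(load_a, load_b, transfer)  # 384.0 64.0 160.0
```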
Parallel Decomposition of Unstructured FEM-Meshes
 Concurrency: Practice & Experience
, 1995
Abstract

Cited by 42 (15 self)
We present a massively parallel algorithm for static and dynamic partitioning of unstructured FEM-meshes. The method consists of two parts. First, a fast but inaccurate sequential clustering is determined which is used, together with a simple mapping heuristic, to map the mesh initially onto the processors of a massively parallel system. The second part of the method uses a massively parallel algorithm to remap and optimize the mesh decomposition, taking several cost functions into account. It first calculates the number of nodes that have to be migrated between pairs of clusters in order to obtain an optimal load balancing. In a second step, nodes to be migrated are chosen according to cost functions optimizing the amount of necessary communication and other measures which are important for the numerical solution method (for example, the aspect ratio of the resulting domains). The parallel parts of the method are implemented in C under Parix to run on the Parsytec GCel systems. R...
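The abstract does not say how the per-pair migration amounts are calculated. As a stand-in, the sketch below uses a simple first-order diffusion scheme, a common way to obtain such amounts; it is not claimed to be the paper's method. The function name, the `alpha` step size, and the three-cluster example are all hypothetical.

```python
def diffusion_flows(loads, edges, alpha=0.25, sweeps=200):
    """First-order diffusion on the cluster graph.

    loads: node count per cluster.
    edges: list of (u, v) pairs of neighbouring clusters.
    Returns (final_loads, flow), where flow[(u, v)] is the net amount
    to migrate from u to v (negative means from v to u).
    alpha should stay below 1 / (maximum cluster degree) to converge.
    """
    load = [float(x) for x in loads]
    flow = {e: 0.0 for e in edges}
    for _ in range(sweeps):
        # Compute all exchanges from the loads at the start of the sweep.
        delta = {(u, v): alpha * (load[u] - load[v]) for (u, v) in edges}
        for (u, v), d in delta.items():
            load[u] -= d
            load[v] += d
            flow[(u, v)] += d
    return load, flow

# Hypothetical example: three clusters in a chain, 0 - 1 - 2.
final_loads, flow = diffusion_flows([10, 4, 1], [(0, 1), (1, 2)])
print(final_loads, flow)
```

Because mesh nodes are indivisible, the fractional flows would still need rounding before the second step selects which nodes to move.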
Parallel Structures and Dynamic Load Balancing for Adaptive Finite Element Computation
 Applied Numerical Mathematics
, 1996
Abstract

Cited by 39 (12 self)
this paper, we have focused on describing and comparing several load balancing schemes. Comparisons by timing are difficult, since times vary between runs having the same parameters. The high-speed switch of the IBM SP2 computer is a shared resource that affects run times. More subtle effects can result from differences in the order in which messages used for migration are processed. Changes in the order in which those messages are received and integrated into the local MDB result in different traversal orders of the mesh entities. These differences cause small changes in load balancings and coarsenings. While such differences in meshes and partitionings do not affect the solution accuracy, they can cause sufficient changes in efficiency to make precise timings difficult. Qualitatively, PSIRB produced the best partitions (measured as a function of total analysis time). Octree-generated partitions were comparable but resulted in slightly longer solution times. In both cases, one or two iterations of partition boundary smoothing led to a quality improvement. ITB by itself resulted in poorer partition quality, but is useful when mesh changes are small between computational stages. Predictive enrichment provided superior performance to our current enrichment process with transient problems where there are frequent enrichment and balancing steps. Enhancements to the existing load balancing procedures and the implementation of new ones are under investigation. Improvements in the slice-by-slice technique used by ITB for migration are necessary. Experiments with geometrical methods that use the spatial location of elements relative to the centroids of sending and receiving processors showed promise at reducing the number of processor interconnections. Vidwans et al. [39] pr...
Web Document Clustering Using Hyperlink Structures
, 2001
Abstract

Cited by 38 (5 self)
With the exponential growth of information on the World Wide Web, there is great demand for developing efficient and effective methods for organizing and retrieving the information available. Document clustering plays an important role in information retrieval and taxonomy management for the World Wide Web and remains an interesting and challenging problem in the field of web computing. In this paper we consider document clustering methods exploring textual information, hyperlink structure and co-citation relations. In particular, we apply the normalized-cut clustering method developed in computer vision to the task of hyperdocument clustering. We also explore some theoretical connections of the normalized-cut method to the K-means method. We then experiment with the normalized-cut method in the context of clustering query result sets for web search engines.
Dynamic Load-Balancing for Parallel Adaptive Unstructured Meshes
, 1997
Abstract

Cited by 30 (4 self)
A parallel method for dynamic partitioning of unstructured meshes is described. The method employs a new iterative optimisation technique which both balances the workload and attempts to minimise the interprocessor communications overhead. Experiments on a series of adaptively refined meshes indicate that the algorithm provides partitions of an equivalent or higher quality to static partitioners (which do not reuse the existing partition) and much more quickly. Perhaps more importantly, the algorithm results in only a small fraction of the amount of data migration compared to the static partitioners. Key words. graph-partitioning, adaptive unstructured meshes, load-balancing, parallel scientific computation. 1 Introduction The use of unstructured mesh codes on parallel machines can be one of the most efficient ways to solve large Computational Fluid Dynamics (CFD) and Computational Mechanics (CM) problems. Completely general geometries and complex behaviour can be readily modelled an...