Results 1-10 of 81
A survey of kernel and spectral methods for clustering
, 2008
Cited by 88 (5 self)
Clustering algorithms are a useful tool to explore data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel versions of many classical clustering algorithms, e.g., K-means, SOM and neural gas. Spectral clustering arises from concepts in spectral graph theory, and the clustering problem is configured as a graph cut problem where an appropriate objective function has to be optimized. An explicit proof of the fact that these two paradigms have the same objective is reported, since it has been proven that these two seemingly different approaches have the same mathematical foundation. Besides, fuzzy kernel clustering methods are presented as extensions of the kernel K-means clustering algorithm.
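To make the graph cut view concrete, the sketch below evaluates one commonly used objective, the two-way normalized cut, for a given partition of a small weighted graph. It is an illustration only: the abstract does not say which cut objective the survey emphasizes, and the function name `normalized_cut` and the toy affinity matrix `W` are assumptions introduced here.

```python
def normalized_cut(weights, labels):
    """Two-way normalized cut of a weighted undirected graph.

    weights: symmetric matrix (list of lists) of non-negative affinities.
    labels:  list of 0/1 cluster assignments, one per node.
    Assumes both clusters are non-empty and have positive total degree.
    """
    n = len(weights)
    cut = 0.0          # total weight of edges crossing the partition
    vol = [0.0, 0.0]   # sum of node degrees inside each cluster
    for i in range(n):
        for j in range(n):
            w = weights[i][j]
            vol[labels[i]] += w
            # Count each crossing edge once (i < j).
            if i < j and labels[i] != labels[j]:
                cut += w
    return cut / vol[0] + cut / vol[1]


# Hypothetical toy graph: two triangles joined by one weak edge (2-3).
W = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
good = normalized_cut(W, [0, 0, 0, 1, 1, 1])  # splits along the weak edge
bad = normalized_cut(W, [0, 1, 0, 1, 0, 1])   # cuts through both triangles
```

Splitting along the weak edge gives a much smaller objective value than a partition that cuts through the triangles. Spectral methods typically minimize a relaxed version of such an objective using eigenvectors, rather than by enumerating partitions.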
A Comparison of image segmentation algorithms”, The Robotics
, 2005
Cited by 31 (3 self)
Unsupervised image segmentation algorithms have matured to the point where they generate reasonable segmentations, and thus can begin to be incorporated into larger systems. A system designer now has an array of available algorithm choices; however, few objective numerical evaluations exist of these segmentation algorithms. As a first step towards filling this gap, this paper presents an evaluation of two popular segmentation algorithms, the mean shift-based segmentation algorithm and a graph-based segmentation scheme. We also consider a hybrid method which combines the other two methods. This quantitative evaluation is made possible by the recently proposed measure of segmentation correctness, the Normalized Probabilistic Rand (NPR) index, which allows a principled comparison between segmentations created by different algorithms, as well as segmentations on different images. For each algorithm, we consider its correctness as measured by the NPR index, as well as its stability with respect to changes in parameter settings and with respect to different images. An algorithm which produces correct segmentation results with
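As background for the NPR index mentioned above, the sketch below computes the plain Rand index between two labelings of the same pixels. This is not the NPR index itself: the probabilistic averaging over several reference segmentations and the normalization step are not described in this abstract and are omitted here. The function name `rand_index` and the toy labelings are assumptions.

```python
from itertools import combinations


def rand_index(a, b):
    """Plain Rand index between two labelings of the same items.

    Returns the fraction of item pairs on which the two labelings agree,
    i.e. both put the pair in the same segment or both separate it.
    Assumes len(a) == len(b) >= 2.
    """
    agree = 0
    pairs = 0
    for i, j in combinations(range(len(a)), 2):
        same_a = a[i] == a[j]
        same_b = b[i] == b[j]
        agree += same_a == same_b
        pairs += 1
    return agree / pairs


# Label names do not matter, only which items are grouped together.
identical = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the measure depends only on pair groupings, relabeling the segments leaves it unchanged, which is what makes it usable for comparing segmentations produced by different algorithms.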
Spectral Methods for Mesh Processing and Analysis
 EUROGRAPHICS 2007
, 2007
Cited by 31 (0 self)
Spectral methods for mesh processing and analysis rely on the eigenvalues, eigenvectors, or eigenspace projections derived from appropriately defined mesh operators to carry out desired tasks. Early works in this area can be traced back to the seminal paper by Taubin in 1995, where spectral analysis of mesh geometry based on a combinatorial Laplacian aids our understanding of the low-pass filtering approach to mesh smoothing. Over the past ten years or so, the list of applications in the area of geometry processing which utilize the eigenstructures of a variety of mesh operators in different manners has been growing steadily. Many works presented so far draw parallels from developments in fields such as graph theory, computer vision, machine learning, graph drawing, numerical linear algebra, and high-performance computing. This state-of-the-art report aims to provide a comprehensive survey on the spectral approach, focusing on its power and versatility in solving geometry processing problems and attempting to bridge the gap between relevant research in computer graphics and other fields. Necessary theoretical background will be provided and existing works will be classified according to different criteria (the operators or eigenstructures employed, application domains, or the dimensionality of the spectral embeddings used) and described in adequate length. Finally, despite much empirical success, there still remain many open questions pertaining to the spectral approach, which we will discuss in the report as well.
Clustering by weighted cuts in directed graphs
 In Proceedings of the 2007 SIAM International Conference on Data Mining
, 2007
Cited by 29 (1 self)
In this paper we formulate spectral clustering in directed graphs as an optimization problem, the objective being a weighted cut in the directed graph. This objective extends several popular criteria like the normalized cut and the averaged cut to asymmetric affinity data. We show that this problem can be relaxed to a Rayleigh quotient problem for a symmetric matrix obtained from the original affinities, and therefore a large body of the results and algorithms developed for spectral clustering of symmetric data immediately extends to asymmetric cuts.
Consensus Clusterings
Cited by 22 (0 self)
In this paper we address the problem of combining multiple clusterings without access to the underlying features of the data. This process is known in the literature as clustering ensembles, clustering aggregation, or consensus clustering. Consensus clustering yields a stable and robust final clustering that is in agreement with multiple clusterings. We find that an iterative EM-like method is remarkably effective for this problem. We present three iterative algorithms for finding clustering consensus. An extensive empirical study compares our proposed algorithms with eleven other consensus clustering methods on four data sets using six different clustering performance metrics. The experimental results show that the new ensemble clustering methods produce clusterings that are as good as, and often better than, these other methods.
DATA SPECTROSCOPY: EIGENSPACE OF CONVOLUTION OPERATORS AND CLUSTERING
, 2008
Cited by 21 (4 self)
This paper focuses on obtaining clustering information in a distribution when i.i.d. data are given. First, we develop theoretical results for understanding and using clustering information contained in the eigenvectors of data adjacency matrices based on a radial kernel function (with a sufficiently fast tail decay). We provide population analyses to give insights into which eigenvectors should be used and when the clustering information for the distribution can be recovered from the data. In particular, we learned that top eigenvectors do not contain all the clustering information. Second, we use heuristics from these analyses to design the Data Spectroscopic clustering (DaSpec) algorithm that uses properly selected top eigenvectors, determines the number of clusters, gives data labels, and provides a classification rule for future data, all based on only one eigendecomposition. Our findings not only extend and go beyond the intuitions underlying existing spectral techniques (e.g. spectral clustering and Kernel Principal Components Analysis), but also provide insights about their usability and modes of failure. Simulation studies and experiments on real-world data are conducted to show the promise of our proposed data spectroscopy clustering algorithm relative to k-means and one spectral method. In particular, DaSpec seems to be able to handle unbalanced groups and recover clusters of different shapes better than competing methods.
An Efficient Spectral Algorithm for Network Community Discovery and Its Applications to Biological and Social Networks
Cited by 21 (0 self)
Automatic discovery of community structures in complex networks is a fundamental task in many disciplines, including social science, engineering, and biology. Recently, a quantitative measure called modularity (Q) has been proposed to effectively assess the quality of community structures. Several community discovery algorithms have since been developed based on the optimization of Q. However, this optimization problem is NP-hard, and the existing algorithms have low accuracy or are computationally expensive. In this paper, we present an efficient spectral algorithm for modularity optimization. When tested on a large number of synthetic or real-world networks, and compared to the existing algorithms, our method is efficient and has a high accuracy. We demonstrate our algorithm on three applications in biology, medicine, and social science. In the first application, we analyze the communities in a gene network, and show that genes in the same community usually have very similar functions, which enables us to predict functions for some new genes. Second, we apply the algorithm to group tumor samples based on gene expression microarray data. Remarkably, our algorithm can automatically detect different types of tumor without any prior knowledge, and by combining our results and clinical information, we can predict the outcomes of chemotherapies with a high accuracy. Finally, we analyze a social network of Usenet newsgroup users, and show that, without any semantic information, we can discover the organization of the newsgroups, and detect user groups with similar interests.
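For readers unfamiliar with the modularity measure, the sketch below evaluates Q for a given community assignment using the standard definition for undirected graphs, Q = (1/2m) Σ_ij [A_ij − k_i k_j / 2m] δ(c_i, c_j). It only scores a partition and does not reproduce the paper's spectral optimization algorithm, whose details are not given in this abstract. The function name `modularity` and the toy graph are assumptions.

```python
def modularity(adj, communities):
    """Modularity Q of a community assignment on an undirected graph.

    adj:         symmetric adjacency matrix (list of lists).
    communities: list of community ids, one per node.
    Assumes the graph has at least one edge.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)  # twice the total edge weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                # Observed weight minus the weight expected at random.
                q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


# Hypothetical toy graph: two triangles joined by a single edge (2-3).
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
q_split = modularity(A, [0, 0, 0, 1, 1, 1])  # one community per triangle
q_single = modularity(A, [0, 0, 0, 0, 0, 0])  # everything in one community
```

On this graph the two-triangle split scores 5/14, while placing all nodes in one community scores 0, which matches the intuition that Q rewards groups with more internal edges than expected by chance.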
Clustering through Ranking on Manifolds
 In Proceedings of the 22nd international conference on Machine learning
, 2005
Cited by 21 (1 self)
Clustering aims to find useful hidden structures in data. In this paper we present a new clustering algorithm that builds upon the consistency method (Zhou et al., 2003), a semi-supervised learning technique with the property of learning very smooth functions with respect to the intrinsic structure revealed by the data. Other methods, e.g. Spectral Clustering, obtain good results on data that reveals such a structure. However, unlike Spectral Clustering, our algorithm effectively detects both global and within-class outliers, and the most representative examples in each class. Furthermore, we specify an optimization framework that estimates all learning parameters, including the number of clusters, directly from data. Finally, we show that the learned cluster models can be used to add previously unseen points to clusters without relearning the original cluster model. Encouraging experimental results are obtained on a number of real-world problems.
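The consistency method that this paper builds upon can be sketched as an iterative label-spreading procedure: labels from a few seed points are propagated through a normalized affinity graph via F ← αSF + (1 − α)Y. The code below illustrates only that semi-supervised building block, on one-dimensional points for brevity; it is not the clustering algorithm proposed in the paper. The function name, the parameter defaults (`sigma`, `alpha`, `iters`), and the toy data are assumptions.

```python
import math


def consistency_method(points, seeds, n_classes, sigma=1.0, alpha=0.9, iters=100):
    """Label spreading on 1-D points from a few labeled seeds.

    points: list of floats.
    seeds:  dict mapping point index -> known class id.
    Returns a predicted class id for every point.
    """
    n = len(points)
    # Gaussian affinities with a zero diagonal.
    W = [[0.0 if i == j
          else math.exp(-(points[i] - points[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    d = [sum(row) for row in W]
    # Symmetrically normalized affinity S = D^{-1/2} W D^{-1/2}.
    S = [[W[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
    # Y holds the initial labels: one-hot rows for seeds, zero rows otherwise.
    Y = [[0.0] * n_classes for _ in range(n)]
    for idx, c in seeds.items():
        Y[idx][c] = 1.0
    F = [row[:] for row in Y]
    for _ in range(iters):
        # Each point mixes its neighbors' scores with its own initial label.
        F = [[alpha * sum(S[i][k] * F[k][c] for k in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(n_classes)] for i in range(n)]
    # Assign each point to the class with the largest score.
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]


# Hypothetical toy data: two well-separated groups, one seed label in each.
predicted = consistency_method([0.0, 0.1, 0.2, 5.0, 5.1, 5.2], {0: 0, 3: 1}, 2)
```

With one labeled point per group, the remaining points take the label of the seed in their own group, because affinities across the gap are negligible compared with those within each group.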
Multiway cuts and spectral clustering
, 2003
Cited by 20 (0 self)
We look at spectral clustering as optimization. We show that near some special points called perfect, spectral clustering optimizes simultaneously two criteria: a dissimilarity measure that we call the multiway normalized cut (MNCut) and a cluster coherence measure that we call the gap. The immediate implication from the user's point of view is that spectral clustering will optimize any tradeoff between MNCut and gap, which may explain its success in practice. Finally, we propose new methods for selecting K based on the gap and show their superior performance in experiments.
Incremental spectral clustering and seasons: Appearance-based localization in outdoor environments
 Proceedings of the International Conference on Robotics and Automation (ICRA)
, 2008
Cited by 18 (0 self)
The problem of appearance-based mapping and navigation in outdoor environments is far from trivial. In this paper, an appearance-based topological map, covering a large, mixed indoor and outdoor environment, is built incrementally by using panoramic images. The map is based on image similarity, so that the resulting segmentation of the world corresponds closely to the human concept of a place. Using high-resolution images and the epipolar constraint, the resulting map is shown to be very suitable for localization, even when the environment has undergone seasonal changes.