Results 1 - 10 of 11,882
Constrained K-means Clustering with Background Knowledge
In ICML, 2001
"... Clustering is traditionally viewed as an unsupervised method for data analysis. However, in some cases information about the problem domain is available in addition to the data instances themselves. In this paper, we demonstrate how the popular k-means clustering algorithm can be profitably modified ..."
Cited by 488 (9 self)
k-means++: The advantages of careful seeding
In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’07, 2007
"... The k-means method is a widely used clustering technique that seeks to minimize the average squared distance between points in the same cluster. Although it offers no accuracy guarantees, its simplicity and speed are very appealing in practice. By augmenting k-means with a very simple, ran ..."
Cited by 478 (8 self)
An Efficient k-Means Clustering Algorithm: Analysis and Implementation
2000
"... K-means clustering is a very popular clustering technique, which is used in numerous applications. Given a set of n data points in R^d and an integer k, the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its ..."
Cited by 417 (4 self)
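As a rough illustration of the objective quoted in this abstract, the Python sketch below computes a mean squared distance for a given set of centers. The abstract is cut off before it says which center each point is measured against; the sketch assumes the nearest center, and the function name `mean_squared_distance` is hypothetical, not taken from the paper.

```python
from math import dist


def mean_squared_distance(points, centers):
    """Average, over all points, of the squared distance to a center.

    Assumption: each point is measured against its nearest center,
    since the quoted abstract is truncated at that spot.
    """
    total = 0.0
    for p in points:
        # Squared Euclidean distance from p to its closest center.
        total += min(dist(p, c) ** 2 for c in centers)
    return total / len(points)


if __name__ == "__main__":
    points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
    centers = [(1.0, 0.0), (10.0, 0.0)]
    print(mean_squared_distance(points, centers))
```

A clustering algorithm in this setting would search for the k centers that make this value as small as possible; the sketch only evaluates the value for centers that are already given.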
X-means: Extending K-means with Efficient Estimation of the Number of Clusters
In Proceedings of the 17th International Conf. on Machine Learning, 2000
"... Despite its popularity for general clustering, K-means suffers three major shortcomings; it scales poorly computationally, the number of clusters K has to be supplied by the user, and the search is prone to local minima. We propose solutions for the first two problems, and a partial remedy for the t ..."
Cited by 418 (5 self)
A comparison of document clustering techniques
In KDD Workshop on Text Mining, 2000
"... This paper presents the results of an experimental study of some common document clustering techniques: agglomerative hierarchical clustering and K-means. (We used both a “standard” K-means algorithm and a “bisecting” K-means algorithm.) Our results indicate that the bisecting K-means technique is ..."
Cited by 613 (27 self)
Mean shift, mode seeking, and clustering
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995
"... Mean shift, a simple iterative procedure that shifts each data point to the average of data points in its neighborhood, is generalized and analyzed in this paper. This generalization makes some k-means-like clustering algorithms its special cases. It is shown that mean shift is a mode-seeking proce ..."
Cited by 624 (0 self)
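The basic procedure this abstract describes (shift each data point to the average of the data points in its neighborhood) can be sketched in a few lines of Python. This is only a minimal illustration, not the generalized form analyzed in the paper: the neighborhood is assumed to be a fixed-radius ball, and the name `mean_shift_step` and the `radius` parameter are hypothetical.

```python
from math import dist


def mean_shift_step(points, radius):
    """Return a new list where every point is replaced by the mean of
    the original points lying within `radius` of it.

    Assumption: the neighborhood is a ball of fixed radius (a flat
    kernel). Each point is always its own neighbor, so the mean is
    always defined.
    """
    shifted = []
    for p in points:
        neighbors = [q for q in points if dist(p, q) <= radius]
        mean = tuple(
            sum(q[i] for q in neighbors) / len(neighbors)
            for i in range(len(p))
        )
        shifted.append(mean)
    return shifted


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
    print(mean_shift_step(points, radius=2.0))
    # [(0.5, 0.0), (0.5, 0.0), (10.0, 0.0)]
```

Applying the step repeatedly moves nearby points together, while an isolated point such as (10.0, 0.0) stays where it is.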
Distance metric learning, with application to clustering with side-information
In Advances in Neural Information Processing Systems 15, 2002
"... Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as K-means initially fails to find one that is meaningful to a user, the only recourse may be for ..."
Cited by 818 (13 self)
Estimating the number of clusters in a dataset via the Gap statistic
2000
"... We propose a method (the “Gap statistic”) for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. k-means or hierarchical), comparing the change in within-cluster dispersion to that expected under an appropriate reference ..."
Cited by 502 (1 self)
Refining Initial Points for K-Means Clustering
1998
"... Practical approaches to clustering use an iterative procedure (e.g. K-Means, EM) which converges to one of numerous local minima. It is known that these iterative techniques are especially sensitive to initial starting conditions. We present a procedure for computing a refined starting condition fro ..."
Cited by 317 (5 self)
Data Clustering: 50 Years Beyond K-Means
2008
"... Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. As an example, a common scheme of scientific classification puts organisms into taxonomic ranks: domain, kingdom, phylum, class, etc. Cluster analysis is the formal study of algorithms and m ..."
Cited by 294 (7 self)
"... (unsupervised learning) from classification or discriminant analysis (supervised learning). The aim of clustering is exploratory in nature to find structure in data. Clustering has a long and rich history in a variety of scientific fields. One of the most popular and simple clustering algorithms, K-means ..."