## The effectiveness of Lloyd-type methods for the k-means problem (2006)

Venue: 47th IEEE Symposium on Foundations of Computer Science (FOCS 2006)

Citations: 51 (4 self)

### BibTeX

@INPROCEEDINGS{Ostrovsky06theeffectiveness,
  author    = {Rafail Ostrovsky and Yuval Rabani},
  title     = {The effectiveness of {L}loyd-type methods for the k-means problem},
  booktitle = {47th IEEE Symposium on Foundations of Computer Science (FOCS)},
  year      = {2006},
  pages     = {165--176}
}

### Abstract

We investigate variants of Lloyd’s heuristic for clustering high dimensional data in an attempt to explain its popularity (a half century after its introduction) among practitioners, and in order to suggest improvements in its application. We propose and justify a clusterability criterion for data sets. We present variants of Lloyd’s heuristic that quickly lead to provably near-optimal clustering solutions when applied to well-clusterable instances. This is the first performance guarantee for a variant of Lloyd’s heuristic. The provision of a guarantee on output quality does not come at the expense of speed: some of our algorithms are candidates for being faster in practice than currently used variants of Lloyd’s method. In addition, our other algorithms are faster on well-clusterable instances than recently proposed approximation algorithms, while maintaining similar guarantees on clustering quality. Our main algorithmic contribution is a novel probabilistic seeding process for the starting configuration of a Lloyd-type iteration.
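
The abstract describes a Lloyd-type iteration initialized by a probabilistic seeding process. As a hedged illustration only (a common distance-squared seeding sketch in the same spirit, not necessarily the paper's exact process; all names below are hypothetical), the two pieces can be sketched as:

```python
import random

def d2_seed(points, k, seed=0):
    """Pick k initial centers by sampling each new center with probability
    proportional to its squared distance from the nearest center chosen so
    far (a D^2-style seeding sketch, not the paper's exact process)."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # squared distance of every point to its nearest chosen center
        d2 = [min(sum((p[i] - c[i]) ** 2 for i in range(len(p))) for c in centers)
              for p in points]
        r = rng.uniform(0, sum(d2))
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(p)
                break
    return centers

def lloyd(points, centers, iters=20):
    """Standard Lloyd iteration: assign each point to its nearest center,
    then move each center to the mean of the points assigned to it."""
    k = len(centers)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda j: sum((p[i] - centers[j][i]) ** 2
                                            for i in range(len(p))))
            clusters[nearest].append(p)
        # an empty cluster keeps its old center
        centers = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
                   for j, cl in enumerate(clusters)]
    return centers
```

On well-separated data, squared-distance sampling tends to place initial centers in distinct clusters, which is the kind of instance the paper's clusterability condition singles out.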

### Citations

8168 | Maximum likelihood from incomplete data via the em algorithm
- Dempster, Laird, et al.
- 1977
Citation Context ...nd less methodically in the same year by Cox [9]; also apparently by psychologists between 1959-67 [49]). This and similar iterative descent methods soon became the dominant approaches to the problem [35, 33, 12, 31] (see also [19, 20, 24] and the references therein); they remain so today, and are still being improved [1, 42, 44, 28]. Lloyd’s method (in any variant) converges only to local optima however, and is ...

1875 | Some methods for classification and analysis of multivariate observations
- MacQueen
- 1967
Citation Context ...nd less methodically in the same year by Cox [9]; also apparently by psychologists between 1959-67 [49]). This and similar iterative descent methods soon became the dominant approaches to the problem [35, 33, 12, 31] (see also [19, 20, 24] and the references therein); they remain so today, and are still being improved [1, 42, 44, 28]. Lloyd’s method (in any variant) converges only to local optima however, and is ...

1662 | Vector quantization and signal compression
- Gray, Gersho
- 1992
Citation Context ... same year by Cox [9]; also apparently by psychologists between 1959-67 [49]). This and similar iterative descent methods soon became the dominant approaches to the problem [35, 33, 12, 31] (see also [19, 20, 24] and the references therein); they remain so today, and are still being improved [1, 42, 44, 28]. Lloyd’s method (in any variant) converges only to local optima however, and is sensitive to the choice...

1348 | Finding Groups in Data. An Introduction to Cluster Analysis
- Kaufman, Rousseeuw
- 1990
Citation Context ...is sensitive to the choice of the initial centers [38]. Consequently, a lot of research has been directed toward seeding methods that try to start off Lloyd’s method with a good initial configuration [18, 29, 17, 23, 46, 5, 36, 43]. Very few theoretical guarantees are known about Lloyd’s method or its variants. The convergence rate of Lloyd’s method has recently been investigated in [10, 22, 2] and in particular, [2] shows that...

1310 | Data clustering: a review
- Jain, Murty, et al.
- 1999
Citation Context ... same year by Cox [9]; also apparently by psychologists between 1959-67 [49]). This and similar iterative descent methods soon became the dominant approaches to the problem [35, 33, 12, 31] (see also [19, 20, 24] and the references therein); they remain so today, and are still being improved [1, 42, 44, 28]. Lloyd’s method (in any variant) converges only to local optima however, and is sensitive to the choice...

1221 | An algorithm for vector quantizer design
- Linde, Buzo, et al.
- 1980
Citation Context ...nd less methodically in the same year by Cox [9]; also apparently by psychologists between 1959-67 [50]). This and similar iterative descent methods soon became the dominant approaches to the problem [35, 33, 12, 31] (see also [19, 20, 24] and the references therein); they remain so today, and are still being improved [1, 43, 45, 28]. Lloyd’s method (in any variant) converges only to local optima however, and is ...

849 | Least Squares Quantization in PCM
- Lloyd
- 1982
Citation Context ... survey the most relevant literature here. The k-means problem seems to have been first considered by Steinhaus in 1956 [48]. A simple greedy iteration to minimize cost was suggested in 1957 by Lloyd [32] (and less methodically in the same year by Cox [9]; also apparently by psychologists between 1959-67 [49]). This and similar iterative descent methods soon became the dominant approaches to the probl...

318 | Approximation algorithms for metric facility location and k-median problems using the primal-dual scheme and Lagrangian relaxation
- Jain, Vazirani
- 2001
Citation Context ...se is the algorithm of Kumar, Sabharwal and Sen [30], which presents a linear time PTAS for a fixed k. There are also various constant-factor approximation algorithms for the related k-median problem [26, 7, 6, 25, 37], which also yield approximation algorithms for k-means, and have running time polynomial in n, k and the dimension; recently Kanungo et al. [27] adapted the k-median algorithm of [3] to obtain a (9 +...

249 | Latent semantic indexing: a probabilistic analysis
- Papadimitriou, Tamaki, et al.
- 1998
Citation Context ...de the standard theoretical criteria; and in addition, whether theoretical scrutiny suggests improvements in their application. This is the approach we take in this paper. As in other prominent cases [47, 41], such an analysis typically involves some abandonment of the worst-case inputs criterion. (In fact, part of the challenge is to identify simple conditions on the input, that allow one to prove a perf...

236 | Refining initial points for K-Means clustering
- Bradley, Fayyad
- 1998
Citation Context ...is sensitive to the choice of the initial centers [38]. Consequently, a lot of research has been directed toward seeding methods that try to start off Lloyd’s method with a good initial configuration [18, 29, 17, 23, 46, 5, 36, 43]. Very few theoretical guarantees are known about Lloyd’s method or its variants. The convergence rate of Lloyd’s method has recently been investigated in [10, 22, 2] and in particular, [2] shows that...

231 | Local search heuristics for k-median and facility location problems
- Arya, Garg, et al.
Citation Context ...m [26, 7, 6, 25, 37], which also yield approximation algorithms for k-means, and have running time polynomial in n, k and the dimension; recently Kanungo et al. [27] adapted the k-median algorithm of [3] to obtain a (9 + ɛ)-approximation algorithm for k-means. However, none of these methods match the simplicity and speed of the popular Lloyd’s method. Researchers concerned with the runtime of Lloyd’s...

218 | An efficient k-means clustering algorithm: Analysis and implementation
- Kanungo, Mount
- 2002
Citation Context ...ar iterative descent methods soon became the dominant approaches to the problem [35, 33, 12, 31] (see also [19, 20, 24] and the references therein); they remain so today, and are still being improved [1, 42, 44, 28]. Lloyd’s method (in any variant) converges only to local optima however, and is sensitive to the choice of the initial centers [38]. Consequently, a lot of research has been directed toward seeding m...

211 | A constant-factor approximation algorithm for the k-median problem
- Charikar, Guha, et al.
- 1999
Citation Context ...se is the algorithm of Kumar, Sabharwal and Sen [30], which presents a linear time PTAS for a fixed k. There are also various constant-factor approximation algorithms for the related k-median problem [26, 7, 6, 25, 37], which also yield approximation algorithms for k-means, and have running time polynomial in n, k and the dimension; recently Kanungo et al. [27] adapted the k-median algorithm of [3] to obtain a (9 +...

207 | Quantizing for minimum distortion
- Max
- 1960
Citation Context ...nd less methodically in the same year by Cox [9]; also apparently by psychologists between 1959-67 [49]). This and similar iterative descent methods soon became the dominant approaches to the problem [35, 33, 12, 31] (see also [19, 20, 24] and the references therein); they remain so today, and are still being improved [1, 42, 44, 28]. Lloyd’s method (in any variant) converges only to local optima however, and is ...

204 | Improved combinatorial algorithms for the facility location and kmedian problems
- Charikar, Guha
- 1999
Citation Context ...se is the algorithm of Kumar, Sabharwal and Sen [30], which presents a linear time PTAS for a fixed k. There are also various constant-factor approximation algorithms for the related k-median problem [26, 7, 6, 25, 37], which also yield approximation algorithms for k-means, and have running time polynomial in n, k and the dimension; recently Kanungo et al. [27] adapted the k-median algorithm of [3] to obtain a (9 +...

146 | Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time
- Spielman, Teng
- 2004
Citation Context ...de the standard theoretical criteria; and in addition, whether theoretical scrutiny suggests improvements in their application. This is the approach we take in this paper. As in other prominent cases [47, 41], such an analysis typically involves some abandonment of the worst-case inputs criterion. (In fact, part of the challenge is to identify simple conditions on the input, that allow one to prove a perf...

111 | Approximate clustering via core-sets
- Bădoiu, Har-Peled, et al.
- 2002
Citation Context ...algorithms for this problem. Matoušek [34] gave the first PTAS for this problem, with running time polynomial in n, for a fixed k and dimension. Subsequently a succession of algorithms have appeared [40, 4, 11, 15, 16, 21, 30] with varying runtime dependency on n, k and the dimension. The most recent of these is the algorithm of Kumar, Sabharwal and Sen [30], which presents a linear time PTAS for a fixed k. There are also ...

109 | Smooth sensitivity and sampling in private data analysis
- Nissim, Raskhodnikova, et al.
Citation Context ...paration condition (or similar ones) for the k-means and possibly other clustering problems. For instance, it might be possible to obtain stronger, or more general, algorithmic results. Nissim et al. [40] have obtained a result in this vein: they exploit the robustness of our separation condition to design secure, privacy-preserving ways of computing a near-optimal k-means solution when the data satis...

106 | Optimization and Simplification of Hierarchical Clusterings
- Fisher
- 1995
Citation Context ...is sensitive to the choice of the initial centers [38]. Consequently, a lot of research has been directed toward seeding methods that try to start off Lloyd’s method with a good initial configuration [18, 29, 17, 23, 46, 5, 36, 43]. Very few theoretical guarantees are known about Lloyd’s method or its variants. The convergence rate of Lloyd’s method has recently been investigated in [10, 22, 2] and in particular, [2] shows that...

106 | Accelerating exact k-means algorithms with geometric reasoning
- Pelleg, Moore
- 1999
Citation Context ...ar iterative descent methods soon became the dominant approaches to the problem [35, 33, 12, 31] (see also [19, 20, 24] and the references therein); they remain so today, and are still being improved [1, 42, 44, 28]. Lloyd’s method (in any variant) converges only to local optima however, and is sensitive to the choice of the initial centers [38]. Consequently, a lot of research has been directed toward seeding m...

101 | Greedy facility location algorithms analyzed using dual fitting with factor-revealing LP
- Jain, Mahdian, et al.

98 | An empirical comparison of four initialization methods for the K-Means algorithm, Pattern recognition letters 20
- Pena, Lozano, et al.
- 1999

80 | An experimental comparison of several clustering and initialization methods
- Meila, Heckerman
- 1998

73 | Clustering large graphs via the singular value decomposition
- Drineas, Frieze, et al.
- 2004
Citation Context ...ecently been investigated in [10, 22, 2] and in particular, [2] shows that Lloyd’s method can require a superpolynomial number of iterations to converge. The k-means problem is NP-hard even for k = 2 [13]. Recently there has been substantial progress in developing approximation algorithms for this problem. Matoušek [34] gave the first PTAS for this problem, with running time polynomial in n, for a fi...

70 | A local search approximation algorithm for k-means clustering
- Kanungo, Mount, et al.
- 2002
Citation Context ...orithms for the related k-median problem [26, 7, 6, 25, 37], which also yield approximation algorithms for k-means, and have running time polynomial in n, k and the dimension; recently Kanungo et al. [27] adapted the k-median algorithm of [3] to obtain a (9 + ɛ)-approximation algorithm for k-means. However, none of these methods match the simplicity and speed of the popular Lloyd’s method. Researchers...

59 | An Examination of the Effect of Six Types of Error Perturbation on Fifteen Clustering Algorithms
- Milligan
- 1980
Citation Context ...ein); they remain so today, and are still being improved [1, 42, 44, 28]. Lloyd’s method (in any variant) converges only to local optima however, and is sensitive to the choice of the initial centers [38]. Consequently, a lot of research has been directed toward seeding methods that try to start off Lloyd’s method with a good initial configuration [18, 29, 17, 23, 46, 5, 36, 43]. Very few theoretical ...

56 | An efficient k-means clustering algorithm
- Alsabti, Ranka, et al.
- 1998
Citation Context ...ar iterative descent methods soon became the dominant approaches to the problem [35, 33, 12, 31] (see also [19, 20, 24] and the references therein); they remain so today, and are still being improved [1, 42, 44, 28]. Lloyd’s method (in any variant) converges only to local optima however, and is sensitive to the choice of the initial centers [38]. Consequently, a lot of research has been directed toward seeding m...

55 | Approximation schemes for clustering problems
- de la Vega, Karpinski, et al.
- 2003
Citation Context ...algorithms for this problem. Matoušek [34] gave the first PTAS for this problem, with running time polynomial in n, for a fixed k and dimension. Subsequently a succession of algorithms have appeared [40, 4, 11, 15, 16, 21, 30] with varying runtime dependency on n, k and the dimension. The most recent of these is the algorithm of Kumar, Sabharwal and Sen [30], which presents a linear time PTAS for a fixed k. There are also ...

36 | How slow is the k-means method
- Arthur, Vassilvitskii
- 2006
Citation Context ...configuration [18, 29, 17, 23, 46, 5, 36, 43]. Very few theoretical guarantees are known about Lloyd’s method or its variants. The convergence rate of Lloyd’s method has recently been investigated in [10, 22, 2] and in particular, [2] shows that Lloyd’s method can require a superpolynomial number of iterations to converge. The k-means problem is NP-hard even for k = 2 [13]. Recently there has been substantia...

36 | Sur la division des corps matériels en parties
- Steinhaus
Citation Context ...models of k point sources under spherically symmetric Gaussian noise. We briefly survey the most relevant literature here. The k-means problem seems to have been first considered by Steinhaus in 1956 [48]. A simple greedy iteration to minimize cost was suggested in 1957 by Lloyd [32] (and less methodically in the same year by Cox [9]; also apparently by psychologists between 1959-67 [49]). This and si...

35 | On coresets for k-means and k-median clustering
- Har-Peled, Mazumdar
- 2004
Citation Context ...algorithms for this problem. Matoušek [34] gave the first PTAS for this problem, with running time polynomial in n, for a fixed k and dimension. Subsequently a succession of algorithms have appeared [40, 4, 11, 15, 16, 21, 30] with varying runtime dependency on n, k and the dimension. The most recent of these is the algorithm of Kumar, Sabharwal and Sen [30], which presents a linear time PTAS for a fixed k. There are also ...

31 | Optimal time bounds for approximate clustering
- Mettu, Plaxton
- 2004

31 | Polynomial time approximation schemes for geometric clustering problems
- Ostrovsky, Rabani
- 2002

30 | Clustering for edge-cost minimization
- Schulman
- 2000
Citation Context ...ct, part of the challenge is to identify simple conditions on the input, that allow one to prove a performance guarantee of wide applicability.) Our starting point is the notion that (as discussed in [45]) one should be concerned with k-clustering data that possesses a meaningful k-clustering. ...

26 | How Fast is the k-Means Method
- Har-Peled, Sadri
- 2005
Citation Context ...configuration [18, 29, 17, 23, 46, 5, 36, 43]. Very few theoretical guarantees are known about Lloyd’s method or its variants. The convergence rate of Lloyd’s method has recently been investigated in [10, 22, 2] and in particular, [2] shows that Lloyd’s method can require a superpolynomial number of iterations to converge. The k-means problem is NP-hard even for k = 2 [13]. Recently there has been substantia...

26 | Randomized Algorithms (Cambridge University Press)
- Motwani, Raghavan
- 1995
Citation Context ...we bound Pr[W_N > 0]. Define Z_t = Y_t + t(1 − 5ρ). Then E[Z_{t+1} | Z_1, . . . , Z_t] ≤ Z_t, so Z_0, Z_1, . . . forms a supermartingale. Clearly |Z_{t+1} − Z_t| ≤ 1 for all t. So by Azuma’s inequality (see, e.g., [39]), Pr[Z_N − Z_0 > √(2N ln(2/δ))] ≤ δ, which implies that W_N ≤ k + √(2N ln(2/δ)) − N(1 − 5ρ) with probability at least 1 − δ. Plugging in the value of N shows that N(1 − 5ρ) − √(2N ln(2/δ)) ≥ k. Corollary 4.5 (...

20 | Comparison of algorithms for dissimilarity-based compound selection
- Snarey, Terret, et al.
- 1997

18 | Cluster analysis of multivariate data: Efficiency vs. interpretability of classification
- Forgy
- 1965

18 | A simple linear time (1 + ε)-approximation algorithm for k-means clustering in any dimensions
- Kumar, Sabharwal, et al.
Citation Context ...ariants, which are used with multiple re-seedings and many Lloyd steps per re-seeding. (iii) We also give a PTAS by combining our seeding process with a sampling procedure of Kumar, Sabharwal and Sen [30], whose running time is linear in |X| and exponential in k. This PTAS is significantly faster, and also simpler, than the PTAS of Kumar et al. [30] (applying the separation condition to both algorithm...

15 | Note on Grouping
- Cox
- 1957
Citation Context ...ns problem seems to have been first considered by Steinhaus in 1956 [48]. A simple greedy iteration to minimize cost was suggested in 1957 by Lloyd [32] (and less methodically in the same year by Cox [9]; also apparently by psychologists between 1959-67 [49]). This and similar iterative descent methods soon became the dominant approaches to the problem [35, 33, 12, 31] (see also [19, 20, 24] and the ...

12 | On approximate geometric k-clustering
- Matoušek
Citation Context ...l number of iterations to converge. The k-means problem is NP-hard even for k = 2 [13]. Recently there has been substantial progress in developing approximation algorithms for this problem. Matoušek [34] gave the first PTAS for this problem, with running time polynomial in n, for a fixed k and dimension. Subsequently a succession of algorithms have appeared [40, 4, 11, 15, 16, 21, 30] with varying ru...

8 | Deterministic clustering with data nets
- Effros, Schulman
- 2004

8 | Experimental designs for selecting molecules from large chemical databases
- Higgs, Bemis, et al.
- 1997

7 | How fast is k-means
- Dasgupta
- 2003
Citation Context ...configuration [18, 29, 17, 23, 46, 5, 36, 43]. Very few theoretical guarantees are known about Lloyd’s method or its variants. The convergence rate of Lloyd’s method has recently been investigated in [10, 22, 2] and in particular, [2] shows that Lloyd’s method can require a superpolynomial number of iterations to converge. The k-means problem is NP-hard even for k = 2 [13]. Recently there has been substantia...

6 | The reverse greedy algorithm for the metric k-median problem
- Chrobak, Kenyon, et al.
Citation Context ...lete points (and move centers) until there are k centers left. The running time here is O(n³d). Our deletion procedure is similar to the reverse greedy algorithm proposed by Chrobak, Kenyon and Young [8] for the k-median problem. Chrobak et al. show that their reverse greedy algorithm attains an approximation ratio of O(log n), which is tight up to a factor of log log n. In contrast, for the k-means ...
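
The reverse-greedy idea described in this snippet, starting with every point as a center and deleting centers one at a time, can be sketched as follows. This is a hedged illustration under an assumed 2-D Euclidean input with squared-distance cost, not Chrobak, Kenyon and Young's exact algorithm, and it omits the center-moving step that the paper's own O(n³d) procedure performs:

```python
def reverse_greedy(points, k):
    """Reverse-greedy sketch for k-means in 2-D: start with every point as a
    center, then repeatedly delete the center whose removal increases the
    clustering cost the least, until only k centers remain."""
    def cost(centers):
        # sum of squared distances from each point to its nearest center
        return sum(min((p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers)
                   for p in points)
    centers = list(points)
    while len(centers) > k:
        # try deleting each center; keep the deletion that hurts the cost least
        cheapest = min(range(len(centers)),
                       key=lambda i: cost(centers[:i] + centers[i + 1:]))
        centers.pop(cheapest)
    return centers
```

Each pass evaluates n candidate deletions at O(n²) cost apiece in this naive form; on well-separated data the surviving centers end up one per cluster, since deleting the last center of a cluster would incur a large cost jump.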

3 | Acceleration of k-means and related clustering problems
- Phillips
- 2002
