Clustering data streams: Theory and practice (2003)
| Venue: | IEEE TKDE |
| Citations: | 75 - 2 self |
BibTeX
@ARTICLE{Guha03clusteringdata,
author = {Sudipto Guha and Adam Meyerson and Nina Mishra and Rajeev Motwani},
title = {Clustering data streams: Theory and practice},
journal = {IEEE TKDE},
year = {2003},
volume = {15},
pages = {515--528}
}
Abstract
The data stream model has recently attracted attention for its applicability to numerous types of data, including telephone records, Web documents, and clickstreams. For analysis of such data, the ability to process the data in a single pass, or a small number of passes, while using little memory, is crucial. We describe such a streaming algorithm that effectively clusters large data streams. We also provide empirical evidence of the algorithm’s performance on synthetic and real data streams.
Index Terms: Clustering, data streams, approximation algorithms.







