BIRCH: an efficient data clustering method for very large databases (1996)

by Tian Zhang , Raghu Ramakrishnan , Miron Livny
Venue: In Proc. of the ACM SIGMOD Intl. Conference on Management of Data (SIGMOD
Citations: 576 - 2 self

BibTeX

@INPROCEEDINGS{Zhang96birch:an,
    author = {Tian Zhang and Raghu Ramakrishnan and Miron Livny},
    title = {BIRCH: an efficient data clustering method for very large databases},
    booktitle = {In Proc. of the ACM SIGMOD Intl. Conference on Management of Data (SIGMOD},
    year = {1996},
    pages = {103--114}
}

Abstract

Finding useful patterns in large datasets has attracted considerable interest recently, and one of the most widely studied problems in this area is the identification of clusters, or densely populated regions, in a multi-dimensional dataset. Prior work does not adequately address the problem of large datasets and minimization of I/O costs. This paper presents a data clustering method named BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), and demonstrates that it is especially suitable for very large databases. BIRCH incrementally and dynamically clusters incoming multi-dimensional metric data points to try to produce the best quality clustering with the available resources (i.e., available memory and time constraints). BIRCH can typically find a good clustering with a single scan of the data, and improve the quality further with a few additional scans. BIRCH is also the first clustering algorithm proposed in the database area to handle "noise" (data points that are not part of the underlying pattern) effectively. We evaluate BIRCH's time/space efficiency, data input order sensitivity, and clustering quality through several experiments. We also present a performance comparison of BIRCH versus CLARANS, a clustering method proposed recently for large datasets, and show that BIRCH is consistently ...
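
The incremental, single-scan behavior described in the abstract can be illustrated with a rough sketch. It assumes the clustering feature summary (point count, linear sum, sum of squared norms) that the full paper defines; the abstract itself does not spell it out. The names `ClusteringFeature`, `single_scan`, and `threshold` are hypothetical, and the flat list of clusters stands in for the paper's balanced tree, so this is not the authors' implementation.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ClusteringFeature:
    """Summary of a set of points: count, linear sum, sum of squared norms."""
    n: int
    ls: List[float]
    ss: float

    @classmethod
    def from_point(cls, point: Sequence[float]) -> "ClusteringFeature":
        return cls(1, list(point), sum(x * x for x in point))

    def merged(self, other: "ClusteringFeature") -> "ClusteringFeature":
        # Summaries are additive, so merging never revisits the raw points.
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.ls, other.ls)],
            self.ss + other.ss,
        )

    def centroid(self) -> List[float]:
        return [x / self.n for x in self.ls]

    def radius(self) -> float:
        # Root-mean-square distance of the summarized points to the centroid.
        variance = self.ss / self.n - sum(c * c for c in self.centroid())
        return math.sqrt(max(variance, 0.0))


def single_scan(points, threshold: float) -> List[ClusteringFeature]:
    """Assumed simplification: one pass over the data, flat list, no tree."""
    clusters: List[ClusteringFeature] = []
    for point in points:
        cf = ClusteringFeature.from_point(point)
        if clusters:
            nearest = min(
                range(len(clusters)),
                key=lambda i: math.dist(clusters[i].centroid(), point),
            )
            candidate = clusters[nearest].merged(cf)
            # Absorb the point only if the cluster stays tight enough.
            if candidate.radius() <= threshold:
                clusters[nearest] = candidate
                continue
        clusters.append(cf)
    return clusters


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]
    for cf in single_scan(data, threshold=0.5):
        print(cf.n, cf.centroid(), round(cf.radius(), 3))
```

Because each summary has a fixed size regardless of how many points it absorbs, memory use depends on the number of clusters rather than the number of points, which is the property that makes a single scan feasible. A larger `threshold` yields fewer, coarser clusters.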

Keyphrases

large database    large datasets    efficient data    database area    single scan    considerable interest    balanced iterative reducing    birch time space efficiency    underlying pattern    data clustering method    versus clarans    multi-dimensional metric data point    performance comparison    multi-dimensional dataset    populated region    studied problem    useful pattern    available memory    first clustering algorithm proposed    available resource    several experiment    data point    prior work    time constraint    data input order sensitivity    additional scan
