Clustering with Bregman Divergences (2005)

by Arindam Banerjee, Srujana Merugu, Inderjit Dhillon, Joydeep Ghosh
Venue: Journal of Machine Learning Research
Citations: 441 (57 self)

BibTeX

@ARTICLE{Banerjee05clusteringwith,
    author = {Arindam Banerjee and Srujana Merugu and Inderjit Dhillon and Joydeep Ghosh},
    title = {Clustering with Bregman Divergences},
    journal = {Journal of Machine Learning Research},
    volume = {6},
    year = {2005},
    pages = {1705--1749},
    publisher = {JMLR.org}
}


Abstract

A wide variety of distortion functions are used for clustering, e.g., squared Euclidean distance, Mahalanobis distance, and relative entropy. In this paper, we propose and analyze parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergences. The proposed algorithms unify centroid-based parametric clustering approaches, such as classical k-means and information-theoretic clustering, which arise by special choices of the Bregman divergence. The algorithms maintain the simplicity and scalability of the classical k-means algorithm, while generalizing the basic idea to a very large class of clustering loss functions. There are two main contributions in this paper. First, we pose the hard clustering problem in terms of minimizing the loss in Bregman information, a quantity motivated by rate-distortion theory, and present an algorithm to minimize this loss. Second, we show an explicit bijection between Bregman divergences and exponential families. The bijection enables the development of an alternative interpretation of an efficient EM scheme for learning models involving mixtures of exponential family distributions. This leads to a simple soft clustering algorithm for all Bregman divergences.

Keyphrases

Bregman divergence, distortion function, classical k-means, large class, rate-distortion theory, loss function, hard clustering problem, relative entropy, exponential distribution, wide variety, alternative interpretation, exponential family, Euclidean distance, basic idea, explicit bijection, special choice, Bregman information, main contribution, efficient EM scheme, Mahalanobis distance, information-theoretic clustering
