Results 1 - 10 of 6,317

The stability of a good clustering

by unknown authors, 2011
"... If we have found a ”good ” clustering C of a data set, can we prove that C is not far from the (unknown) best clustering Copt of these data? Perhaps surprisingly, the answer to this question is sometimes yes. This paper proves spectral bounds on the distance d(C, Copt) for the case when “goodness ” ..."
Abstract

Distance metric learning, with application to clustering with side-information

by Eric P Xing, Andrew Y Ng, Michael I Jordan, Stuart Russell - in Advances in Neural Information Processing Systems 15, 2002
"... Abstract Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as K-means initially fails to find one that is meaningful to a user, the only recourse may be for ..."
Abstract - Cited by 818 (13 self)
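
The metric learned in this line of work is a Mahalanobis-style distance d_A(x, y) = sqrt((x − y)^T A (x − y)), parameterized by a positive semidefinite matrix A chosen so that points the user marks as similar end up close. A minimal sketch of evaluating such a metric; the matrix A and the sample points below are illustrative placeholders, not values from the paper:

    import numpy as np

    def mahalanobis_distance(x, y, A):
        """Distance d_A(x, y) = sqrt((x - y)^T A (x - y)) for a PSD matrix A."""
        diff = x - y
        return float(np.sqrt(diff @ A @ diff))

    # Illustrative placeholders: with A = I this reduces to plain Euclidean distance;
    # a learned A stretches or shrinks directions according to the side-information.
    x = np.array([1.0, 2.0])
    y = np.array([2.0, 0.0])
    A_identity = np.eye(2)
    A_learned = np.diag([4.0, 0.25])   # hypothetical learned weights, not from the paper

    print(mahalanobis_distance(x, y, A_identity))
    print(mahalanobis_distance(x, y, A_learned))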

On Spectral Clustering: Analysis and an algorithm

by Andrew Y. Ng, Michael I. Jordan, Yair Weiss - Advances in Neural Information Processing Systems, 2001
"... Despite many empirical successes of spectral clustering methods -- algorithms that cluster points using eigenvectors of matrices derived from the distances between the points -- there are several unresolved issues. First, there is a wide variety of algorithms that use the eigenvectors in slightly ..."
Abstract - Cited by 1713 (13 self)
the algorithm, and give conditions under which it can be expected to do well. We also show surprisingly good experimental results on a number of challenging clustering problems.
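
The "matrices derived from the distances between the points" are, in the algorithm analyzed here, a Gaussian affinity matrix whose normalized top eigenvectors give a low-dimensional embedding that is then clustered with K-means. A rough sketch of that pipeline, assuming a fixed affinity width sigma and scikit-learn's KMeans for the final step:

    import numpy as np
    from sklearn.cluster import KMeans

    def spectral_clustering(X, k, sigma=1.0):
        """Spectral clustering sketch: Gaussian affinities, symmetric normalization,
        top-k eigenvectors, row normalization, then K-means on the embedded rows."""
        # Pairwise squared distances and Gaussian affinity matrix (zero diagonal).
        sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        W = np.exp(-sq / (2.0 * sigma ** 2))
        np.fill_diagonal(W, 0.0)

        # Symmetric normalization L = D^{-1/2} W D^{-1/2}.
        d = W.sum(axis=1)
        D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-12))
        L = D_inv_sqrt @ W @ D_inv_sqrt

        # Embed each point as a row of the top-k eigenvectors, normalized to unit length.
        eigvals, eigvecs = np.linalg.eigh(L)
        U = eigvecs[:, -k:]
        U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)

        # Cluster the embedded rows with K-means.
        return KMeans(n_clusters=k, n_init=10).fit_predict(U)

    # Tiny illustrative example: two well-separated blobs.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(3, 0.1, (20, 2))])
    print(spectral_clustering(X, k=2))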

A comparison of document clustering techniques

by Michael Steinbach, George Karypis, Vipin Kumar - In KDD Workshop on Text Mining, 2000
"... This paper presents the results of an experimental study of some common document clustering techniques: agglomerative hierarchical clustering and K-means. (We used both a “standard” K-means algorithm and a “bisecting ” K-means algorithm.) Our results indicate that the bisecting K-means technique is ..."
Abstract - Cited by 613 (27 self)
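
Bisecting K-means starts with all documents in one cluster and repeatedly splits an existing cluster in two with an ordinary 2-means run until the desired number of clusters is reached. A short sketch of that loop; picking the largest current cluster to split each round is one simple selection rule, and scikit-learn's KMeans stands in for the inner 2-means:

    import numpy as np
    from sklearn.cluster import KMeans

    def bisecting_kmeans(X, k, seed=0):
        """Split clusters one at a time with 2-means until k clusters remain.
        The largest cluster is chosen for splitting; other selection rules
        (e.g. lowest intra-cluster similarity) are also possible."""
        clusters = [np.arange(len(X))]          # start with everything in one cluster
        while len(clusters) < k:
            # Pick the largest current cluster and bisect it with 2-means.
            idx = max(range(len(clusters)), key=lambda i: len(clusters[i]))
            members = clusters.pop(idx)
            labels = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(X[members])
            clusters.append(members[labels == 0])
            clusters.append(members[labels == 1])
        # Convert the list of index sets into a flat label vector.
        assignment = np.empty(len(X), dtype=int)
        for label, members in enumerate(clusters):
            assignment[members] = label
        return assignment

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(c, 0.2, (15, 2)) for c in (0.0, 3.0, 6.0)])
    print(bisecting_kmeans(X, k=3))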

Clustering by passing messages between data points

by Brendan J. Frey, Delbert Dueck - Science, 2007
"... Clustering data by identifying a subset of representative examples is important for processing sensory signals and detecting patterns in data. Such “exemplars ” can be found by randomly choosing an initial subset of data points and then iteratively refining it, but this works well only if that initi ..."
Abstract - Cited by 696 (8 self)
if that initial choice is close to a good solution. We devised a method called “affinity propagation,” which takes as input measures of similarity between pairs of data points. Real-valued messages are exchanged between data points until a high-quality set of exemplars and corresponding clusters gradually emerges
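
Affinity propagation alternates two kinds of real-valued messages over the similarity matrix: "responsibilities", sent from points to candidate exemplars, and "availabilities", sent back from candidate exemplars to points; points whose net self-message ends up positive become exemplars. A condensed numpy sketch with damping; the damping factor, iteration count, and toy data are assumptions for illustration:

    import numpy as np

    def affinity_propagation(S, damping=0.9, iters=200):
        """Sketch of affinity propagation: exchange responsibility and availability
        messages over a similarity matrix S until exemplars emerge."""
        n = S.shape[0]
        R = np.zeros((n, n))  # responsibilities r(i, k)
        A = np.zeros((n, n))  # availabilities  a(i, k)
        for _ in range(iters):
            # Responsibilities: how well-suited k is as exemplar for i,
            # relative to i's best alternative candidate.
            AS = A + S
            idx = np.argmax(AS, axis=1)
            first_max = AS[np.arange(n), idx]
            AS[np.arange(n), idx] = -np.inf
            second_max = np.max(AS, axis=1)
            R_new = S - first_max[:, None]
            R_new[np.arange(n), idx] = S[np.arange(n), idx] - second_max
            R = damping * R + (1 - damping) * R_new

            # Availabilities: accumulated evidence that k should be an exemplar.
            Rp = np.maximum(R, 0)
            np.fill_diagonal(Rp, R.diagonal())
            A_new = np.minimum(0, Rp.sum(axis=0)[None, :] - Rp)
            np.fill_diagonal(A_new, Rp.sum(axis=0) - Rp.diagonal())
            A = damping * A + (1 - damping) * A_new

        exemplars = np.where(np.diag(R + A) > 0)[0]
        if len(exemplars) == 0:                      # fallback for a degenerate run
            exemplars = np.array([int(np.argmax(np.diag(R + A)))])
        labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
        labels[exemplars] = exemplars
        return exemplars, labels

    # Toy example: similarities = negative squared distances; the diagonal
    # "preference" value controls how many exemplars emerge.
    rng = np.random.default_rng(2)
    X = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(4, 0.3, (10, 2))])
    S = -np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(S, np.median(S))
    exemplars, labels = affinity_propagation(S)
    print(exemplars, labels)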

A density-based algorithm for discovering clusters in large spatial databases with noise

by Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu, 1996
"... Clustering algorithms are attractive for the task of class identification in spatial databases. However, the application to large spatial databases rises the following requirements for clustering algorithms: minimal requirements of domain knowledge to determine the input parameters, discovery of clu ..."
Abstract - Cited by 1786 (70 self)
of clusters with arbitrary shape and good efficiency on large databases. The well-known clustering algorithms offer no solution to the combination of these requirements. In this paper, we present the new clustering algorithm DBSCAN relying on a density-based notion of clusters which is designed to discover
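
DBSCAN's density-based notion of a cluster needs two input parameters: a neighbourhood radius (Eps) and a minimum number of points (MinPts). Points with at least MinPts neighbours within Eps are core points, clusters are grown outward from them, and points reachable from no core point are labelled noise. A compact sketch of that procedure; the example data and parameter values are illustrative:

    import numpy as np
    from collections import deque

    def dbscan(X, eps=0.5, min_pts=5):
        """DBSCAN sketch: core points have at least min_pts neighbours within eps;
        clusters grow outward from core points; unreachable points stay noise (-1)."""
        n = len(X)
        dist = np.sqrt(np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
        neighbours = [np.where(dist[i] <= eps)[0] for i in range(n)]
        is_core = np.array([len(nb) >= min_pts for nb in neighbours])

        labels = np.full(n, -1)          # -1 means noise / unassigned
        cluster_id = 0
        for i in range(n):
            if labels[i] != -1 or not is_core[i]:
                continue
            # Breadth-first expansion of a new cluster from core point i.
            labels[i] = cluster_id
            queue = deque(neighbours[i])
            while queue:
                j = queue.popleft()
                if labels[j] == -1:
                    labels[j] = cluster_id
                    if is_core[j]:       # only core points keep expanding the cluster
                        queue.extend(neighbours[j])
            cluster_id += 1
        return labels

    rng = np.random.default_rng(3)
    dense = np.vstack([rng.normal(0, 0.2, (30, 2)), rng.normal(3, 0.2, (30, 2))])
    noise = rng.uniform(-2, 5, (5, 2))
    print(dbscan(np.vstack([dense, noise]), eps=0.5, min_pts=5))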

On Clusterings: Good, Bad and Spectral

by Ravi Kannan, Santosh Vempala, Adrian Vetta, 2003
"... We motivate and develop a natural bicriteria measure for assessing the quality of a clustering which avoids the drawbacks of existing measures. A simple recursive heuristic is shown to have poly-logarithmic worst-case guarantees under the new measure. The main result of the paper is the analysis of ..."
Abstract - Cited by 332 (11 self)
of a popular spectral algorithm. One variant of spectral clustering turns out to have effective worst-case guarantees; another finds a "good" clustering, if one exists.

Automatic Word Sense Discrimination

by Hinrich Schütze - Journal of Computational Linguistics, 1998
"... This paper presents context-group discrimination, a disambiguation algorithm based on clustering. Senses are interpreted as groups (or clusters) of similar contexts of the ambiguous word. Words, contexts, and senses are represented in Word Space, a high-dimensional, real-valued space in which closen ..."
Abstract - Cited by 536 (1 self)
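
Context-group discrimination represents each occurrence of the ambiguous word as a vector built from its surrounding words and clusters those vectors; each resulting context group is read as one sense. The sketch below is a much-simplified version of that idea: the toy "bank" contexts, the plain bag-of-words vectors, and the use of K-means are illustrative assumptions rather than the paper's Word Space construction:

    import numpy as np
    from sklearn.cluster import KMeans

    # Toy occurrences of the ambiguous word "bank" (illustrative, not from the paper).
    contexts = [
        "river bank water fish stream",
        "bank loan money interest account",
        "money deposit bank branch account",
        "bank river mud water erosion",
    ]

    # Build a simple bag-of-words context vector per occurrence.
    vocab = sorted({w for c in contexts for w in c.split() if w != "bank"})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = np.zeros((len(contexts), len(vocab)))
    for row, c in enumerate(contexts):
        for w in c.split():
            if w in index:
                vectors[row, index[w]] += 1.0

    # Cluster the context vectors; each cluster is read as one induced sense.
    senses = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
    print(senses)   # occurrences grouped into two context groups / senses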

Web Document Clustering: A Feasibility Demonstration

by Oren Zamir, Oren Etzioni, 1998
"... Abstract Users of Web search engines are often forced to sift through the long ordered list of document “snippets” returned by the engines. The IR community has explored document clustering as an alternative method of organizing retrieval results, but clustering has yet to be deployed on the major s ..."
Abstract - Cited by 435 (3 self)
that clusters based on snippets are almost as good as clusters created using the full text of Web documents. To satisfy the stringent requirements of the Web domain, we introduce an incremental, linear time (in the document collection size) algorithm called Suffix Tree Clustering (STC), which creates clusters
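
STC treats each phrase shared by several snippets as a "base cluster" (the set of documents containing that phrase) and then merges base clusters whose document sets overlap heavily, so documents may belong to more than one cluster. The sketch below keeps that base-cluster-and-merge idea but omits the actual suffix tree; the toy snippets, the stop-word list, the phrase length, and the 0.5 overlap threshold are assumptions for illustration:

    from itertools import combinations

    # Toy snippets (illustrative only).
    snippets = [
        "jaguar is a large cat found in south america",
        "the jaguar is a big cat of the americas",
        "jaguar cars produced a new luxury sedan",
        "the new jaguar luxury sedan was unveiled",
    ]

    STOP = {"the", "a", "is", "of", "was", "in", "and"}

    def phrases(text, max_len=3):
        """All contiguous phrases of up to max_len non-stop-words."""
        words = [w for w in text.split() if w not in STOP]
        return {" ".join(words[i:i + n])
                for n in range(1, max_len + 1)
                for i in range(len(words) - n + 1)}

    # Base clusters: for each phrase, the set of documents containing it; drop
    # phrases in only one snippet or in every snippet (a crude stand-in for
    # STC's phrase scoring).
    base = {}
    for doc_id, snippet in enumerate(snippets):
        for p in phrases(snippet):
            base.setdefault(p, set()).add(doc_id)
    base = {p: docs for p, docs in base.items() if 2 <= len(docs) < len(snippets)}

    # Merge base clusters whose document sets overlap strongly.
    merged = [set(docs) for docs in base.values()]
    changed = True
    while changed:
        changed = False
        for a, b in combinations(range(len(merged)), 2):
            overlap = len(merged[a] & merged[b]) / min(len(merged[a]), len(merged[b]))
            if overlap > 0.5:
                merged[a] |= merged[b]
                del merged[b]
                changed = True
                break

    # Deduplicate and print the resulting snippet clusters (document id groups).
    unique = {frozenset(c) for c in merged}
    print([sorted(c) for c in unique])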

Tabu Search -- Part I

by Fred Glover, 1989
"... This paper presents the fundamental principles underlying tabu search as a strategy for combinatorial optimization problems. Tabu search has achieved impressive practical successes in applications ranging from scheduling and computer channel balancing to cluster analysis and space planning, and more ..."
Abstract - Cited by 680 (11 self)
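
Tabu search improves a solution by repeatedly moving to the best neighbouring solution while keeping recently used moves on a "tabu" list, so the search does not immediately undo them and can escape local optima; a tabu move may still be accepted if it beats the best solution found so far (aspiration). A generic sketch on a toy bit-flipping problem; the tenure, iteration count, and toy objective are illustrative assumptions:

    import random

    def tabu_search(objective, start, neighbours, tabu_tenure=7, iterations=200):
        """Generic tabu search sketch: greedy best-neighbour moves, with recently
        used moves held tabu so the search cannot immediately cycle back."""
        current = start
        best, best_value = start, objective(start)
        tabu = {}                                   # move -> iteration until which it is tabu
        for it in range(iterations):
            candidates = []
            for move, state in neighbours(current):
                value = objective(state)
                # Aspiration: a tabu move is allowed if it beats the best value so far.
                if tabu.get(move, -1) >= it and value <= best_value:
                    continue
                candidates.append((value, move, state))
            if not candidates:
                break
            value, move, current = max(candidates, key=lambda c: c[0])
            tabu[move] = it + tabu_tenure           # forbid reversing this move for a while
            if value > best_value:
                best, best_value = current, value
        return best, best_value

    # Toy problem: maximize the number of ones in a bit string; a "move" flips one bit.
    def bit_neighbours(bits):
        return [(i, bits[:i] + (1 - bits[i],) + bits[i + 1:]) for i in range(len(bits))]

    random.seed(0)
    start = tuple(random.randint(0, 1) for _ in range(20))
    solution, value = tabu_search(lambda b: sum(b), start, bit_neighbours)
    print(value, solution)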