Results 1 - 10 of 20,026

A comparison of document clustering techniques

by Michael Steinbach, George Karypis, Vipin Kumar - In KDD Workshop on Text Mining, 2000
This paper presents the results of an experimental study of some common document clustering techniques: agglomerative hierarchical clustering and K-means. (We used both a “standard” K-means algorithm and a “bisecting” K-means algorithm.) Our results indicate that the bisecting K-means technique is ...
Cited by 613 (27 self)

Clustering Techniques

by Hard Assignment
• Place similar objects in the same group and assign dissimilar objects to different groups
  – Word clustering
    • Neighbor overlap: words occur with similar left and right neighbors (such as "in" and "on")
  – Document clustering
    • Documents with similar topics or concepts are put together
    • But clu ...

Laplacian eigenmaps and spectral techniques for embedding and clustering.

by Mikhail Belkin, Partha Niyogi - Proceeding of Neural Information Processing Systems, 2001
Drawing on the correspondence between the graph Laplacian, the Laplace-Beltrami operator on a manifold, and the connections to the heat equation, we propose a geometrically motivated algorithm for constructing a representation for data sampled from a low dimensional manifold embedded in a higher dimensional space. The algorithm provides a computationally efficient approach to nonlinear dimensionality reduction that has locality preserving properties and a natural connection to clustering. Several applications are considered. In many areas of artificial intelligence, information ...
Cited by 668 (7 self)

A Survey of Clustering Techniques

by Pradeep Rai, Shubha Singh
The goal of this survey is to provide a comprehensive review of different clustering techniques in data mining.
Cited by 10 (0 self)

Literature Survey: Clustering Technique

by Ajinkya V Jiman, Prof K Harmeet, Khanuja, 2016
Clustering is a partition of data into groups of similar or dissimilar objects. Clustering is an unsupervised learning technique that helps to find hidden patterns in data objects. These hidden patterns represent a data concept. Clustering is used in many data mining applications for data ...

On the Performance of Object Clustering Techniques

by Manolis Tsangaris, Jeffrey F. Naughton
We investigate the performance of some of the best-known object clustering algorithms on four different workloads based upon the Tektronix benchmark. For all four workloads, stochastic clustering gave the best performance for a variety of performance metrics. Since stochastic clustering is computati ...
Cited by 67 (0 self)

OPTICS: Ordering Points To Identify the Clustering Structure

by Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander, 1999
Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of ... e.g. representative points, arbitrary shaped clusters), but also the intrinsic clustering structure. For medium sized data sets, the cluster-ordering can be represented graphically and for very large data sets, we introduce an appropriate visualization technique. Both are suitable for interactive exploration ...
Cited by 527 (51 self)

Genetic algorithm-based clustering technique

by Ujjwal Maulik, Sanghamitra B - Pattern Recognition, 2000
A genetic algorithm-based clustering technique, called GA-clustering, is proposed in this article. The searching capability of genetic algorithms is exploited in order to search for appropriate cluster centres in the feature space such that a similarity metric of the resulting clusters is optimized. ...
Cited by 86 (0 self)

Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections

by Douglass R. Cutting, David R. Karger, Jan O. Pedersen, John W. Tukey, 1992
Document clustering has not been well received as an information retrieval tool. Objections to its use fall into two main categories: first, that clustering is too slow for large corpora (with running time often quadratic in the number of documents); and second, that clustering does not appreciably improve retrieval. We argue that these problems arise only when clustering is used in an attempt to improve conventional search techniques. However, looking at clustering as an information access tool in its own right obviates these objections, and provides a powerful new access paradigm. We present a ...
Cited by 777 (12 self)

Estimating the number of clusters in a dataset via the Gap statistic

by Robert Tibshirani, Guenther Walther, Trevor Hastie, 2000
We propose a method (the "Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. k-means or hierarchical), comparing the change in within cluster dispersion to that expected under an appropriate reference ...
Cited by 502 (1 self)
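
The idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' code. It assumes a plain Lloyd's k-means as the clustering algorithm, squared Euclidean distance to the cluster mean as the within-cluster dispersion, and a uniform reference distribution over the bounding box of the data. All function names (`kmeans`, `within_dispersion`, `gap_statistic`) and the parameter `n_refs` are hypothetical, and the standard-error correction used in the paper's selection rule is omitted.

```python
import math
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means on a list of coordinate tuples; returns labels."""
    centers = rng.sample(points, k)  # assumption: random initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Move each center to the mean of its members (skip empty clusters).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def within_dispersion(points, labels, k):
    """Sum of squared distances from each point to its cluster mean."""
    total = 0.0
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            continue
        mean = tuple(sum(c) / len(members) for c in zip(*members))
        total += sum(math.dist(p, mean) ** 2 for p in members)
    return total


def gap_statistic(points, k, rng, n_refs=10):
    """Mean log-dispersion of uniform reference sets minus that of the data."""
    log_w = math.log(within_dispersion(points, kmeans(points, k, rng), k))
    lows = [min(c) for c in zip(*points)]
    highs = [max(c) for c in zip(*points)]
    ref_logs = []
    for _ in range(n_refs):
        # Reference data: uniform over the bounding box of the observed data.
        ref = [
            tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
            for _ in points
        ]
        ref_logs.append(math.log(within_dispersion(ref, kmeans(ref, k, rng), k)))
    return sum(ref_logs) / n_refs - log_w
```

A larger gap for a given k suggests that the data are clustered more tightly at that k than structureless reference data would be. On two well-separated blobs, for instance, the gap at k = 2 should exceed the gap at k = 1.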

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University