• Documents
  • Authors
  • Tables
  • Log in
  • Sign up
  • MetaCart
  • DMCA
  • Donate

CiteSeerX logo

Tools

Sorted by:
Try your query at:
Semantic Scholar Scholar Academic
Google Bing DBLP
Results 1 - 10 of 14,537
Next 10 →

ROCK: A Robust Clustering Algorithm for Categorical Attributes

by Sudipto Guha, Rajeev Rastogi, Kyuseok Shim - In Proc.ofthe15thInt.Conf.onDataEngineering , 2000
"... Clustering, in data mining, is useful to discover distribution patterns in the underlying data. Clustering algorithms usually employ a distance metric based (e.g., euclidean) similarity measure in order to partition the database such that data points in the same partition are more similar than point ..."
Abstract - Cited by 446 (2 self) - Add to MetaCart
Clustering, in data mining, is useful to discover distribution patterns in the underlying data. Clustering algorithms usually employ a distance metric based (e.g., euclidean) similarity measure in order to partition the database such that data points in the same partition are more similar than

CURE: An Efficient Clustering Algorithm for Large Data sets

by Sudipto Guha, Rajeev Rastogi, Kyuseok Shim - Published in the Proceedings of the ACM SIGMOD Conference , 1998
"... Clustering, in data mining, is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favor clusters with spherical shapes and similar sizes, or are very fragile in the presence of outliers. We propose a new clustering ..."
Abstract - Cited by 722 (5 self) - Add to MetaCart
Clustering, in data mining, is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favor clusters with spherical shapes and similar sizes, or are very fragile in the presence of outliers. We propose a new

Randomized Gossip Algorithms

by Stephen Boyd, Arpita Ghosh, Balaji Prabhakar, Devavrat Shah - IEEE TRANSACTIONS ON INFORMATION THEORY , 2006
"... Motivated by applications to sensor, peer-to-peer, and ad hoc networks, we study distributed algorithms, also known as gossip algorithms, for exchanging information and for computing in an arbitrarily connected network of nodes. The topology of such networks changes continuously as new nodes join a ..."
Abstract - Cited by 532 (5 self) - Add to MetaCart
and old nodes leave the network. Algorithms for such networks need to be robust against changes in topology. Additionally, nodes in sensor networks operate under limited computational, communication, and energy resources. These constraints have motivated the design of “gossip ” algorithms: schemes which

Consistency of spectral clustering

by Ulrike von Luxburg, Mikhail Belkin, Olivier Bousquet , 2004
"... Consistency is a key property of statistical algorithms, when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spe ..."
Abstract - Cited by 572 (15 self) - Add to MetaCart
Consistency is a key property of statistical algorithms, when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family

Adaptive clustering for mobile wireless networks

by Chunhung Richard Lin, Mario Gerla - IEEE Journal on Selected Areas in Communications , 1997
"... This paper describes a self-organizing, multihop, mobile radio network, which relies on a code division access scheme for multimedia support. In the proposed network architecture, nodes are organized into nonoverlapping clusters. The clusters are independently controlled and are dynamically reconfig ..."
Abstract - Cited by 561 (11 self) - Add to MetaCart
reconfigured as nodes move. This network architecture has three main advantages. First, it provides spatial reuse of the bandwidth due to node clustering. Secondly, bandwidth can be shared or reserved in a controlled fashion in each cluster. Finally, the cluster algorithm is robust in the face of topological

Automatic Subspace Clustering of High Dimensional Data

by Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan - Data Mining and Knowledge Discovery , 2005
"... Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the or ..."
Abstract - Cited by 724 (12 self) - Add to MetaCart
Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity

Mean shift: A robust approach toward feature space analysis

by Dorin Comaniciu, Peter Meer - In PAMI , 2002
"... A general nonparametric technique is proposed for the analysis of a complex multimodal feature space and to delineate arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure, the mean shift. We prove for discrete data the convergence ..."
Abstract - Cited by 2395 (37 self) - Add to MetaCart
A general nonparametric technique is proposed for the analysis of a complex multimodal feature space and to delineate arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure, the mean shift. We prove for discrete data

OPTICS: Ordering Points To Identify the Clustering Structure

by Mihael Ankerst, Markus M. Breunig, Hans-peter Kriegel, Jörg Sander , 1999
"... Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of ..."
Abstract - Cited by 527 (51 self) - Add to MetaCart
Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all

Cluster Ensembles - A Knowledge Reuse Framework for Combining Multiple Partitions

by Alexander Strehl, Joydeep Ghosh, Claire Cardie - Journal of Machine Learning Research , 2002
"... This paper introduces the problem of combining multiple partitionings of a set of objects into a single consolidated clustering without accessing the features or algorithms that determined these partitionings. We first identify several application scenarios for the resultant 'knowledge reuse&ap ..."
Abstract - Cited by 603 (20 self) - Add to MetaCart
This paper introduces the problem of combining multiple partitionings of a set of objects into a single consolidated clustering without accessing the features or algorithms that determined these partitionings. We first identify several application scenarios for the resultant 'knowledge reuse

Estimating the number of clusters in a dataset via the Gap statistic

by Robert Tibshirani, Guenther Walther, Trevor Hastie , 2000
"... We propose a method (the \Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. k-means or hierarchical), comparing the change in within cluster dispersion to that expected under an appropriate reference ..."
Abstract - Cited by 502 (1 self) - Add to MetaCart
We propose a method (the \Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. k-means or hierarchical), comparing the change in within cluster dispersion to that expected under an appropriate reference
Next 10 →
Results 1 - 10 of 14,537
Powered by: Apache Solr
  • About CiteSeerX
  • Submit and Index Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University