Results 1 - 10 of 987,479
OPTICS: Ordering Points To Identify the Clustering Structure
, 1999
"... Cluster analysis is a primary method for database mining. It is either used as a standalone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of ..."
Cited by 511 (49 self)
the intrinsic clustering structure accurately. We introduce a new algorithm for the purpose of cluster analysis which does not produce a clustering of a data set explicitly; but instead creates an augmented ordering of the database representing its density-based clustering structure. This cluster
Adaptive clustering for mobile wireless networks
 IEEE Journal on Selected Areas in Communications
, 1997
"... This paper describes a self-organizing, multihop, mobile radio network, which relies on a code division access scheme for multimedia support. In the proposed network architecture, nodes are organized into nonoverlapping clusters. The clusters are independently controlled and are dynamically reconfig ..."
Cited by 556 (11 self)
This paper describes a self-organizing, multihop, mobile radio network, which relies on a code division access scheme for multimedia support. In the proposed network architecture, nodes are organized into nonoverlapping clusters. The clusters are independently controlled and are dynamically
Clustering Gene Expression Patterns
, 1999
"... Recent advances in biotechnology allow researchers to measure expression levels for thousands of genes simultaneously, across different conditions and over time. Analysis of data produced by such experiments offers potential insight into gene function and regulatory mechanisms. A key step in the ana ..."
Cited by 446 (11 self)
expression data. We define an appropriate stochastic error model on the input, and prove that under the conditions of the model, the algorithm recovers the cluster structure with high probability. The running time of the algorithm on an n-gene dataset is O(n^2 (log(n))^c). We also present a practical
Model-Based Clustering, Discriminant Analysis, and Density Estimation
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2000
"... Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures and most clustering methods available in commercial software are also of this type. However, there is little ..."
Cited by 557 (28 self)
Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures and most clustering methods available in commercial software are also of this type. However
Distance Metric Learning, With Application To Clustering With Side-Information
 ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 15
, 2003
"... Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as K-means initially fails to find one that is meaningful to a user, the only recourse may be for the us ..."
Cited by 799 (14 self)
Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as K-means initially fails to find one that is meaningful to a user, the only recourse may
Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering
 Advances in Neural Information Processing Systems 14
, 2001
"... Drawing on the correspondence between the graph Laplacian, the Laplace-Beltrami operator on a manifold, and the connections to the heat equation, we propose a geometrically motivated algorithm for constructing a representation for data sampled from a low dimensional manifold embedded in a higher ..."
Cited by 664 (8 self)
higher dimensional space. The algorithm provides a computationally efficient approach to nonlinear dimensionality reduction that has locality preserving properties and a natural connection to clustering. Several applications are considered.
The structure and function of complex networks
 SIAM REVIEW
, 2003
"... Inspired by empirical studies of networked systems such as the Internet, social networks, and biological networks, researchers have in recent years developed a variety of techniques and models to help us understand or predict the behavior of these systems. Here we review developments in this field, ..."
Cited by 2578 (7 self)
, including such concepts as the small-world effect, degree distributions, clustering, network correlations, random graph models, models of network growth and preferential attachment, and dynamical processes taking place on networks.
Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections
, 1992
"... Document clustering has not been well received as an information retrieval tool. Objections to its use fall into two main categories: first, that clustering is too slow for large corpora (with running time often quadratic in the number of documents); and second, that clustering does not appreciably ..."
Cited by 772 (12 self)
Document clustering has not been well received as an information retrieval tool. Objections to its use fall into two main categories: first, that clustering is too slow for large corpora (with running time often quadratic in the number of documents); and second, that clustering does not appreciably
GPFS: A Shared-Disk File System for Large Computing Clusters
 In Proceedings of the 2002 Conference on File and Storage Technologies (FAST
, 2002
"... GPFS is IBM's parallel, shared-disk file system for cluster computers, available on the RS/6000 SP parallel supercomputer and on Linux clusters. GPFS is used on many of the largest supercomputers in the world. GPFS was built on many of the ideas that were developed in the academic community ove ..."
Cited by 518 (3 self)
GPFS is IBM's parallel, shared-disk file system for cluster computers, available on the RS/6000 SP parallel supercomputer and on Linux clusters. GPFS is used on many of the largest supercomputers in the world. GPFS was built on many of the ideas that were developed in the academic community
Estimating the number of clusters in a dataset via the Gap statistic
, 2000
"... We propose a method (the "Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. k-means or hierarchical), comparing the change in within-cluster dispersion to that expected under an appropriate reference ..."
Cited by 492 (1 self)
We propose a method (the "Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. k-means or hierarchical), comparing the change in within-cluster dispersion to that expected under an appropriate reference