OPTICS: Ordering Points To Identify the Clustering Structure
, 1999
Cited by 262 (42 self)
Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of the well-known clustering algorithms require input parameters which are hard to determine but have a significant influence on the clustering result. Furthermore, for many real data sets there does not even exist a global parameter setting for which the result of the clustering algorithm describes the intrinsic clustering structure accurately. We introduce a new algorithm for the purpose of cluster analysis which does not produce a clustering of a data set explicitly, but instead creates an augmented ordering of the database representing its density-based clustering structure. This cluster-ordering contains information which is equivalent to the density-based clusterings corresponding to a broad range of parameter settings. It is a versatile basis for both automatic and interactive cluster analysis. We show how to automatically and efficiently extract not only ‘traditional’ clustering information (e.g. representative points, arbitrarily shaped clusters), but also the intrinsic clustering structure. For medium-sized data sets, the cluster-ordering can be represented graphically, and for very large data sets we introduce an appropriate visualization technique. Both are suitable for interactive exploration of the intrinsic clustering structure, offering additional insights into the distribution and correlation of the data.
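The idea of an augmented ordering can be illustrated with a small brute-force sketch. It is not the authors' implementation: the function name `optics_order`, the parameters `eps` and `min_pts`, the convention that a point counts as its own neighbour, and the O(n^2) neighbour search are all assumptions made for illustration. Each point is emitted in the order it is visited, together with a reachability value; low values mark points inside dense regions and large jumps mark cluster boundaries.

```python
import heapq
import math


def optics_order(points, eps, min_pts):
    """Return (order, reach): a visiting order and one reachability value per point.

    Brute-force sketch; eps and min_pts are illustrative parameter names.
    """
    n = len(points)

    def dist(i, j):
        return math.dist(points[i], points[j])

    reach = [math.inf] * n       # reachability distance, inf if undefined
    processed = [False] * n
    order = []

    def neighbors(i):
        # Assumption: a point is counted as a member of its own neighbourhood.
        return [j for j in range(n) if dist(i, j) <= eps]

    def core_distance(i, nbrs):
        # Distance to the min_pts-th nearest neighbour, or None if i is not a core point.
        if len(nbrs) < min_pts:
            return None
        return sorted(dist(i, j) for j in nbrs)[min_pts - 1]

    for start in range(n):
        if processed[start]:
            continue
        seeds = [(math.inf, start)]          # priority queue keyed by reachability
        while seeds:
            _, i = heapq.heappop(seeds)
            if processed[i]:
                continue                     # stale queue entry, already emitted
            processed[i] = True
            order.append(i)
            nbrs = neighbors(i)
            cd = core_distance(i, nbrs)
            if cd is None:
                continue                     # non-core points do not expand the ordering
            for j in nbrs:
                if processed[j]:
                    continue
                new_reach = max(cd, dist(i, j))
                if new_reach < reach[j]:
                    reach[j] = new_reach
                    heapq.heappush(seeds, (new_reach, j))
    return order, reach
```

Plotting `reach` in the sequence given by `order` yields the kind of graphical representation the abstract mentions for medium-sized data sets: valleys in the plot correspond to clusters.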
An efficient k-means clustering algorithm
- In Proceedings of IPPS/SPDP Workshop on High Performance Data Mining
, 1998
Cited by 37 (0 self)
In this paper, we present a novel algorithm for performing k-means clustering. It organizes all the patterns in a k-d tree structure such that one can efficiently find all the patterns which are closest to a given prototype. The main intuition behind our approach is as follows. All the prototypes are potential candidates for the closest prototype at the root level. However, for the children of the root node, we may be able to prune the candidate set by using simple geometrical constraints. This approach can be applied recursively until the size of the candidate set is one for each node. Our experimental results demonstrate that our scheme can improve the computational speed of the direct k-means algorithm by one to two orders of magnitude in the total number of distance calculations and the overall time of computation.
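The abstract does not spell out the geometrical constraint, so the following is only a sketch of one plausible test, assuming each k-d tree node is summarised by an axis-aligned bounding box. The helper names `min_max_dist_to_box` and `prune_candidates` are hypothetical. A prototype is dropped when even its nearest point in the box is farther away than some other prototype's farthest point in the box, because it then cannot be the closest prototype for any pattern in that node.

```python
import math


def min_max_dist_to_box(p, lo, hi):
    """Smallest and largest Euclidean distance from point p to the box [lo, hi]."""
    dmin2 = 0.0
    dmax2 = 0.0
    for x, a, b in zip(p, lo, hi):
        nearest = min(max(x, a), b)                        # clamp x into [a, b]
        farthest = a if abs(x - a) > abs(x - b) else b     # farther box face in this dimension
        dmin2 += (x - nearest) ** 2
        dmax2 += (x - farthest) ** 2
    return math.sqrt(dmin2), math.sqrt(dmax2)


def prune_candidates(prototypes, candidates, lo, hi):
    """Keep only candidate prototype indices that could be closest to some point in the box."""
    bounds = {c: min_max_dist_to_box(prototypes[c], lo, hi) for c in candidates}
    # Some prototype is guaranteed to be within best_max of every point in the box.
    best_max = min(dmax for _, dmax in bounds.values())
    return [c for c in candidates if bounds[c][0] <= best_max]
```

Applied at each child node with the parent's surviving candidates, the list shrinks as the boxes get smaller; once a single candidate remains, every pattern below that node can be assigned to it without further distance calculations.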
The BANG-Clustering System: Grid-Based Data Analysis
- Proc. Sec. Int. Symp. IDA-97
, 1997
Cited by 9 (0 self)
For the analysis of large images, the clustering of the data set is a common technique to identify correlation characteristics of the underlying value space. In this paper a new approach to hierarchical clustering of very large data sets is presented. The BANG-Clustering system presented in this paper is a novel approach to hierarchical data analysis. It is based on the BANG-Clustering method ([Sch96]) and uses a multidimensional grid data structure to organize the value space surrounding the pattern values. The patterns are grouped into blocks and clustered with respect to the blocks by a topological neighbor search algorithm.
1 Introduction
Clustering methods are extremely important for explorative data analysis, which is an important approach for the analysis of images. Previously presented algorithms can be divided into hierarchical algorithms, e.g. single-linkage, complete-linkage, etc. and partitional algorithms, e.g. K-MEANS, ISODATA, etc. (see [DJ80]). All of these methods suf...
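The general pattern of grouping patterns into blocks and then clustering the blocks by neighbour search can be sketched as follows. This is a deliberately simplified stand-in: it uses a uniform grid with a fixed `cell_size` rather than the BANG grid structure, treats any two occupied cells that touch (including diagonally) as neighbours, and ignores block density. The function name `grid_cluster` is hypothetical.

```python
from collections import defaultdict, deque
from itertools import product


def grid_cluster(points, cell_size):
    """Label each point by the connected component of occupied grid cells it falls in."""
    # Step 1: group patterns into blocks (uniform grid cells in this sketch).
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        key = tuple(int(c // cell_size) for c in p)
        cells[key].append(idx)

    # Step 2: breadth-first search over adjacent occupied cells.
    labels = {}
    next_label = 0
    for key in cells:
        if key in labels:
            continue
        labels[key] = next_label
        queue = deque([key])
        while queue:
            cur = queue.popleft()
            for offset in product((-1, 0, 1), repeat=len(cur)):
                nb = tuple(c + o for c, o in zip(cur, offset))
                if nb in cells and nb not in labels:
                    labels[nb] = next_label
                    queue.append(nb)
        next_label += 1

    # Step 3: every point inherits the label of its cell.
    result = [None] * len(points)
    for key, members in cells.items():
        for idx in members:
            result[idx] = labels[key]
    return result
```

Because the neighbour search runs over occupied cells rather than individual patterns, its cost depends on the number of blocks, which is what makes grid-based approaches attractive for very large data sets.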
A Density Based Approach to Classification
- In Proc. 2003 ACM Symposium on applied computing
, 2003
Cited by 1 (0 self)
This paper presents a novel method for classification, which is density based and makes use of the models built by the lattice machine (LM) [5, 7]. Density is a natural concept to use in clustering, and the LM is a relatively new method for supervised learning developed in recent years. The LM approximates data, producing as a model of the data a set of hyper tuples that are equilabelled, supported and maximal. The method presented in this paper uses the LM model of data to classify new data with a view to maximising the density of the model. In order for the method to have wide applicability, a measure of density is introduced for hyper tuples and relations.
STING: A Statistical Information Grid Approach
, 2006
- Using multi-resolution grid data structure
- Several interesting methods
An Efficient K-Means and C-Means Clustering Algorithm for Image Segmentation
, 2012
In this paper, we present a novel algorithm for performing k-means clustering. It organizes all the patterns in a k-d tree structure such that one can efficiently find all the patterns which are closest to a given prototype. The main intuition behind our approach is as follows. All the prototypes are potential candidates for the closest prototype at the root level. However, for the children of the root node, we may be able to prune the candidate set by using simple geometrical constraints. This approach can be applied recursively until the size of the candidate set is one for each node. Our experimental results demonstrate that our scheme can improve the computational speed of the direct k-means algorithm by one to two orders of magnitude in the total number of distance calculations and the overall time of computation.

