Data Clustering: A Review
 ACM COMPUTING SURVEYS
, 1999
Cited by 1284 (13 self)
Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult problem combinatorially, and differences in assumptions and contexts in different communities have made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.
Computational geometry: a survey
 IEEE TRANSACTIONS ON COMPUTERS
, 1984
Cited by 19 (3 self)
We survey the state of the art of computational geometry, a discipline that deals with the complexity of geometric problems within the framework of the analysis of algorithms. This newly emerged area of activities has found numerous applications in various other disciplines, such as computer-aided design, computer graphics, operations research, pattern recognition, robotics, and statistics. Five major problem areas (convex hulls, intersections, searching, proximity, and combinatorial optimizations) are discussed. Seven algorithmic techniques (incremental construction, plane-sweep, locus, divide-and-conquer, geometric transformation, prune-and-search, and dynamization) are each illustrated with an example. A collection of problem transformations to establish lower bounds for geometric problems in the algebraic computation/decision model is also included.
Clustering in Massive Data Sets
 Handbook of massive data sets
, 1999
Cited by 11 (0 self)
We review the time and storage costs of search and clustering algorithms. We exemplify these, based on case studies in astronomy, information retrieval, visual user interfaces, chemical databases, and other areas. Sections 2 to 6 relate to nearest neighbor searching, an elemental form of clustering, and a basis for clustering algorithms to follow. Sections 7 to 11 review a number of families of clustering algorithm. Sections 12 to 14 relate to visual or image representations of data sets, from which a number of interesting algorithmic developments arise.
Experiments with computing geometric minimum spanning trees
 In Proceedings of ALENEX'00, Lecture Notes in Computer Science
, 2000
Cited by 6 (0 self)
Let S be a set of n points in $\mathbb{R}^d$. We present an algorithm that uses the well-separated pair decomposition and computes the minimum spanning tree of S under any $L_p$ or polyhedral metric. It has an expected running time of O(n log n) for uniform distributions. Experimental results show that this approach is practical. Under a variety of input distributions, the resulting implementation is robust and performs well for points in higher dimensional space.
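For orientation, the problem this abstract addresses can be stated with a brute-force baseline. The sketch below is not the well-separated pair decomposition algorithm from the paper; it is plain Prim's algorithm on the complete graph, which takes O(n^2) time, and the function names `lp_distance` and `brute_force_mst` are illustrative assumptions.

```python
import math


def lp_distance(a, b, p=2.0):
    """L_p distance between two equal-length points; p=math.inf gives L_infinity."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def brute_force_mst(points, p=2.0):
    """Prim's algorithm on the complete graph over `points`.

    Returns (total_weight, edges), where edges are (parent_index, child_index).
    This is an O(n^2) baseline, not the WSPD-based method described above.
    """
    n = len(points)
    if n == 0:
        return 0.0, []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost to the tree
    parent = [-1] * n
    best[0] = 0.0
    total = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        if parent[u] >= 0:
            edges.append((parent[u], u))
        # Relax connection costs of the remaining vertices.
        for v in range(n):
            if not in_tree[v]:
                d = lp_distance(points[u], points[v], p)
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return total, edges
```

A baseline of this kind is mainly useful for checking the output of a faster implementation on small inputs.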
A Probabilistic Minimum Spanning Tree Algorithm
 Information Processing Letters
, 1978
Cited by 5 (0 self)
This paper is concerned with the problem of computing a minimum spanning tree (MST) for n points in a p-dimensional space where the "distance" between each pair of points i and j satisfies the relationship $d_{ij} \geq \max_k \{|x_{ki} - x_{kj}|\}$, where $x_{ki}$ is the coordinate of object i along the kth dimension. This relationship is clearly satisfied by all Minkowski metrics $d_{ij} = \left[\sum_k |x_{ki} - x_{kj}|^r\right]^{1/r}$, $r \geq 1$.
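The relationship between a Minkowski distance and the largest per-coordinate difference can be checked numerically. The sketch below assumes the standard definitions of the Minkowski and Chebyshev (maximum-coordinate) distances; the function names and the small floating-point tolerance are illustrative choices, not taken from the paper.

```python
def chebyshev(a, b):
    """Largest absolute coordinate difference between points a and b."""
    return max(abs(x - y) for x, y in zip(a, b))


def minkowski(a, b, r):
    """Minkowski distance of order r (r >= 1) between points a and b."""
    return sum(abs(x - y) ** r for x, y in zip(a, b)) ** (1.0 / r)


def lower_bound_holds(a, b, r, tol=1e-12):
    """True if minkowski(a, b, r) >= chebyshev(a, b), up to rounding error."""
    return minkowski(a, b, r) >= chebyshev(a, b) - tol


a, b = (0.0, 0.0, 0.0), (3.0, 4.0, 0.0)
for r in (1, 1.5, 2, 3, 10):
    # The maximum coordinate difference never exceeds the Minkowski distance.
    assert lower_bound_holds(a, b, r)
```

Because the maximum coordinate difference is cheap to compute, a bound of this form can in principle be used to rule out candidate pairs before computing the full distance.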
Geometric Minimum Spanning Trees via Well-Separated Pair Decompositions
 ACM JOURNAL OF EXPERIMENTAL ALGORITHMICS
, 2001
Fast Full-Search Equivalent Nearest-Neighbour Search Algorithms
, 1999
Cited by 3 (2 self)
A fundamental activity common to many image processing, pattern classification, and clustering algorithms involves searching a set of n, k-dimensional data for the one which is nearest to a given target item with respect to a distance function. Our goal is to find fast search algorithms which are full-search equivalent; that is, the resulting match is as good as what we could obtain if we were to search the set exhaustively. We propose a framework made up of three components, namely (i) a technique for obtaining a good initial match, (ii) an inexpensive method for determining whether the current match is a full-search equivalent match, and (iii) an effective technique for improving the current match. Our approach is to consider good solutions for each component in order to find an algorithm which balances the overall complexity of the search. We also propose a technique for hierarchical ordering and cluster elimination using a minimal cost spanning tree. Our experiments on vector quantisation coding of images show that the framework and techniques we proposed can be used to construct suitable algorithms for most of our data sets which require full-search equivalent matches at an average arithmetic cost of less than O(k log n) while using only O(n) space.
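To make the idea of a full-search equivalent search concrete, here is a minimal sketch. It does not implement the paper's components or its spanning-tree cluster elimination; it swaps in a well-known stand-in, partial distance elimination, with an initial match chosen by the first coordinate. All function names are assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exhaustive_nearest(data, target):
    """Reference full search: index of the item nearest to target."""
    return min(range(len(data)), key=lambda i: squared_distance(data[i], target))


def partial_distance_nearest(data, target):
    """Nearest-neighbour search whose result is as good as a full search.

    Start from a cheap initial match, then scan the remaining items and
    abandon each one as soon as its partial sum reaches the best distance.
    """
    # Cheap initial match: item whose first coordinate is closest to the target's.
    best = min(range(len(data)), key=lambda i: abs(data[i][0] - target[0]))
    best_d = squared_distance(data[best], target)
    for i, item in enumerate(data):
        if i == best:
            continue
        acc = 0.0
        for x, t in zip(item, target):
            acc += (x - t) ** 2
            if acc >= best_d:
                break  # this item cannot beat the current match
        else:
            # Loop finished without a break, so acc < best_d: a better match.
            best, best_d = i, acc
    return best, best_d
```

Since an item is discarded only when its partial sum already reaches the current best distance, the returned distance always equals the one found by exhaustive search; a better initial match simply lets more items be abandoned early.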
Initialization free graph based clustering
 Laboratoire I3S, CNRS, Université de Nice-Sophia Antipolis
, 2009
Computer Society.
Cited by 1 (0 self)
Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult problem combinatorially, and differences in assumptions and contexts in different communities have made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques and identify cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.
Programming a Best Fitting Two Dimensional Polyline
This report details the steps used to find this least squares polyline. In Section 2 the graph theoretic and least squares algorithms are described. Section 3 details the implementation of the algorithms in C++ and data structures associated with them. An example on a data set is shown in Section 4. For more information on the rudimentary C++ classes and matrix functions used in this coding, the reader is referred to the C++ header files in Section 5, and the C++ algorithm code in Section 6. Section 7 contains a few examples that demonstrate that this code operates properly. Finally, Section 8 will comment on extensions that can be supplemented to this for improvement.