Incremental Clustering and Dynamic Information Retrieval, 1997
Abstract - Cited by 129 (3 self)
Motivated by applications such as document and image classification in information retrieval, we consider the problem of clustering dynamic point sets in a metric space. We propose a model called incremental clustering, which is based on a careful analysis of the requirements of the information retrieval application and which should also be useful in other applications. The goal is to efficiently maintain clusters of small diameter as new points are inserted. We analyze several natural greedy algorithms and demonstrate that they perform poorly. We propose new deterministic and randomized incremental clustering algorithms with provably good performance. We complement our positive results with lower bounds on the performance of incremental algorithms. Finally, we consider the dual clustering problem, where the clusters are of fixed diameter and the goal is to minimize the number of clusters.
1 Introduction
We consider the following problem: as a sequence of points from a metric...
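The incremental setting described in this abstract can be illustrated with a small sketch. The abstract does not spell out which greedy algorithms were analyzed, so the merge rule below (insert each new point as a singleton, then merge the pair of clusters whose union has the smallest diameter whenever more than k clusters exist) is only an assumed example of a "natural greedy" strategy. The names `diameter` and `insert_point` are hypothetical, and the Euclidean metric is an assumption. Since the abstract reports that natural greedy algorithms perform poorly, this shows the problem setup rather than a recommended method.

```python
import math
from itertools import combinations


def diameter(points, dist=math.dist):
    """Largest pairwise distance within a cluster (0.0 for a singleton)."""
    return max((dist(p, q) for p, q in combinations(points, 2)), default=0.0)


def insert_point(clusters, point, k, dist=math.dist):
    """Insert one point while keeping at most k clusters.

    Assumed greedy rule (illustrative only): the new point starts as its own
    cluster; if that yields more than k clusters, merge the pair whose union
    has the smallest diameter.
    """
    clusters = clusters + [[point]]
    if len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: diameter(clusters[ij[0]] + clusters[ij[1]], dist),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


# Points arrive one at a time; at most k = 2 clusters are maintained.
clusters = []
for p in [(0.0, 0.0), (10.0, 0.0), (1.0, 0.0), (11.0, 0.0)]:
    clusters = insert_point(clusters, p, k=2)
```

On this short stream the two nearby pairs end up in separate clusters. Clusters are only ever merged, never split, which is the restriction that makes the incremental model harder than clustering the full point set at once.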
Novel approaches to unsupervised clustering through the k-windows algorithm
Knowledge Mining, volume 185 of Studies in Fuzziness and Soft Computing, 2005
Abstract - Cited by 2 (2 self)
Clustering techniques were originally conceived by Aristotle and Theophrastos in the fourth century B.C. and by Linnaeus in the 18th century [6], but it was not until 1939 that one of the first comprehensive foundations of these methods was published [9]. Clustering is a fundamental process in the knowledge acquisition domain. It refers to the partitioning of a set of objects into groups (clusters) such that objects within the same group are more similar to each other than to objects in different groups. Even the simplest clustering problems are known to be NP-hard [1]. For instance, the Euclidean k-center problem in the plane is NP-hard [7]. In general, the clustering problem can be defined as follows: given a set S of n points in a d-dimensional metric space (R^d, ρ) and an integer k ≤ n, compute a partition Σ of S into k subsets S_1, ..., S_k such that Σ has the smallest possible size. Each S_i is called a cluster, and k is called the number of clusters. We define the size of a cluster S_i to be the maximum distance (under the ρ-metric) between a fixed point c_i, called the center of the cluster, and a point of S_i. The size of a partition is defined as the maximum size of a cluster in the partition.
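The definitions of cluster size and partition size in this abstract translate directly into code. The following is a minimal sketch, assuming the Euclidean distance for ρ and representing each cluster as a (center, points) pair. The function names `cluster_size` and `partition_size` are hypothetical and do not come from the cited chapter.

```python
import math


def cluster_size(center, points, dist=math.dist):
    """Size of one cluster: the maximum distance from its center c_i
    to any point of S_i (0.0 for an empty cluster)."""
    return max((dist(center, p) for p in points), default=0.0)


def partition_size(clusters, dist=math.dist):
    """Size of a partition: the maximum size over all of its clusters.

    `clusters` is a list of (center, points) pairs.
    """
    return max(cluster_size(center, points, dist) for center, points in clusters)


# Two clusters in the plane (d = 2), with Euclidean distance assumed for rho.
clusters = [
    ((0.0, 0.0), [(0.0, 1.0), (3.0, 4.0)]),
    ((10.0, 10.0), [(10.0, 12.0)]),
]
size = partition_size(clusters)
```

Here the first cluster has size 5.0 (its farthest point is (3, 4)) and the second has size 2.0, so the partition size is 5.0. The clustering problem as stated asks for the partition into k subsets that minimizes this value. This sketch only evaluates the objective for a given partition and does not search for the optimum.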

