Results 1 -
3 of
3
Iterate: A conceptual clustering algorithm for data mining
- IEEE TRANSACTIONS ON SYSTEMS, MAN AND CYBERNETICS
, 1998
"... The data exploration task can be divided into three interrelated subtasks: (i) feature selection, (ii) discovery, and (iii) interpretation. This paper describes an unsupervised discovery method with biases geared toward partitioning objects into clusters that improve interpretability. The algorithm, ..."
Abstract
-
Cited by 17 (0 self)
- Add to MetaCart
The data exploration task can be divided into three interrelated subtasks: (i) feature selection, (ii) discovery, and (iii) interpretation. This paper describes an unsupervised discovery method with biases geared toward partitioning objects into clusters that improve interpretability. The algorithm, ITERATE, employs: (i) a data ordering scheme and (ii) an iterative redistribution operator to produce maximally cohesive and distinct clusters. Cohesion or intra-class similarity is measured in terms of the match between individual objects and their assigned cluster prototype. Distinctness or inter-class dissimilarity is measured by an average of the variance of the distribution matchbetween clusters. We demonstrate that interpretability, from a problem solving viewpoint, is addressed by theintra- and interclass measures. Empirical results demonstrate the properties of the discovery algorithm, and its applications to problem solving.
Iterate: A conceptual clustering method for knowledge discovery in databases
- In Braunschweig, B., & Day, R. (Eds.), Innovative Applications of Artificial Intelligence in the Oil and Gas Industry
, 1995
"... ..."
Conceptual Clustering with Numeric-and-Nominal Mixed Data - A New Similarity Based System
- in IEEE Transcript on KCE
, 1998
"... This paper presents a new Similarity Based Agglomerative Clustering(SBAC) algorithm that works well for data with mixed numeric and nominal features. A similarity measure, proposed by Goodall for biological taxonomy[13], that gives greater weight to uncommon feature-value matches in similarity compu ..."
Abstract
-
Cited by 5 (1 self)
- Add to MetaCart
This paper presents a new Similarity Based Agglomerative Clustering(SBAC) algorithm that works well for data with mixed numeric and nominal features. A similarity measure, proposed by Goodall for biological taxonomy[13], that gives greater weight to uncommon feature-value matches in similarity computations and makes no assumptions of the underlying distributions of the feature-values, is adopted to define the similarity measure between pairs of objects. An agglomerative algorithm is employed to construct a concept tree, and a simple distinctness heuristic is used to extract a partition of the data. The performance of SBAC has been studied on artificially generated data sets. Results demonstrate the effectiveness of this algorithm in unsupervised discovery tasks. Comparisons with other schemes illustrate the superior performance of the algorithm. 1 Introduction The widespread use of computers and information technology has made extensive data collection in businesses, manufacturing, an...

