## Iterative Optimization and Simplification of Hierarchical Clusterings (1995)

### Download Links

- [ftp.mrg.dist.unige.it]
- [www.cs.cmu.edu]
- [cswww.vuse.vanderbilt.edu]
- [www.ece.northwestern.edu]
- [www.cs.cmu.edu]
- [www.jair.org]
- CiteULike
- DBLP

### Other Repositories/Bibliography

Venue: Journal of Artificial Intelligence Research

Citations: 109 (2 self)

### BibTeX

@ARTICLE{Fisher95iterativeoptimization,
  author  = {Doug Fisher},
  title   = {Iterative Optimization and Simplification of Hierarchical Clusterings},
  journal = {Journal of Artificial Intelligence Research},
  year    = {1995},
  volume  = {4},
  pages   = {118--123}
}

### Abstract

Clustering is often used for discovering structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. Ideally, the search strategy should consistently construct clusterings of high quality, but be computationally inexpensive as well. In general, we cannot have it both ways, but we can partition the search so that a system inexpensively constructs a 'tentative' clustering for initial examination, followed by iterative optimization, which continues to search in background for improved clusterings. Given this motivation, we evaluate an inexpensive strategy for creating initial clusterings, coupled with several control strategies for iterative optimization, each of which repeatedly modifies an initial clustering in search of a better one. One of these methods appears novel as an iterative optimization strategy in clustering contexts. Once a clustering has been construct...
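The two-phase scheme the abstract describes can be sketched in a few lines: build a cheap 'tentative' clustering first, then let hill-climbing optimization keep improving it. This is a hypothetical illustration, not the paper's algorithm: the objective below is a toy negative squared-error score over flat partitions, standing in for the paper's category-utility measure over hierarchies.

```python
def objective(clusters, data):
    """Toy score: negative sum of squared distances to cluster means."""
    total = 0.0
    for members in clusters:
        if not members:
            continue
        pts = [data[i] for i in members]
        mean = [sum(c) / len(pts) for c in zip(*pts)]
        total -= sum(sum((x - m) ** 2 for x, m in zip(p, mean)) for p in pts)
    return total

def initial_clustering(data, k):
    """Cheap tentative clustering: round-robin assignment of observations."""
    clusters = [set() for _ in range(k)]
    for i in range(len(data)):
        clusters[i % k].add(i)
    return clusters

def iterative_optimize(clusters, data, max_passes=20):
    """Hill climbing: repeatedly reclassify single observations, keeping a
    move only when it improves the objective."""
    score = objective(clusters, data)
    for _ in range(max_passes):
        improved = False
        for i in range(len(data)):
            src = next(c for c in clusters if i in c)
            if len(src) == 1:
                continue  # keep clusters non-empty in this sketch
            for dst in clusters:
                if dst is src:
                    continue
                src.remove(i)
                dst.add(i)
                new_score = objective(clusters, data)
                if new_score > score:
                    score, improved, src = new_score, True, dst
                else:
                    dst.remove(i)  # revert the move
                    src.add(i)
        if not improved:
            break
    return clusters, score
```

The split mirrors the abstract's motivation: `initial_clustering` is the inexpensive foreground step, while `iterative_optimize` is the background search that trades extra computation for quality.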

### Citations

5328 | C4.5: Programs for Machine Learning - Quinlan - 1993 |

4313 | Classification and Regression Trees - Breiman, Friedman, et al. - 1984 |

4119 | Pattern Classification and Scene Analysis - Duda, Hart - 1973 |

3591 | Induction of Decision Trees
- QUINLAN
- 1986
Citation Context: ...A_i = V_ij)]: The information-theoretic analog can be understood as a summation over information gain values, where information gain is an often used selection criterion for decision tree induction (Quinlan, 1986): the clustering analog rewards clusters, C_k, that maximize the sum of information gains over the individual variables, A_i. Both the Gini and Information Gain measures are often-used bases for se... |

673 | Knowledge acquisition via incremental conceptual clustering
- Fisher
- 1987
Citation Context: ...pears to mitigate problems associated with local maxima as measured by the objective function. For evaluation purposes, we couple these strategies with a simple, inexpensive procedure used by Cobweb (Fisher, 1987a, 1987b) and a system by Anderson and Matessa (1991), which constructs an initial hierarchical clustering. These iterative optimization strategies, however, can be paired with other methods for const... |

374 | Computer Systems that learn - Weiss, Kulikowski - 1991 |

271 | Pattern Classification and Scene Analysis - Duda, Hart - 1973 |

258 | Learning from Observations: Conceptual Clustering. In: Machine Learning: An Artificial Intelligence Approach - Michalski, Stepp - 1983 |

249 |
Autoclass: a Bayesian classification system
- Cheeseman, Self, et al.
- 1988
Citation Context: ...ther methods for constructing initial clusterings. Once a clustering has been constructed it is judged by analysts -- often according to task-specific criteria. Several authors (Fisher, 1987a, 1987b; Cheeseman et al., 1988; Anderson & Matessa, 1991) have abstracted these criteria into a generic performance task akin to pattern completion, where the error rate over completed patterns can be used to 'externally' judge th... |

205 | Models of incremental concept formation - Gennari, Langley, et al. - 1989 |

180 |
An empirical comparison of selection measures for decision-tree induction
- Mingers
- 1989
Citation Context: ...crease the similarity of observations across clusters (i.e., coupling). Category utility is similar in form to the Gini Index, which has been used in supervised systems that construct decision trees (Mingers, 1989b; Weiss & Kulikowski, 1991). The Gini Index is typically intended to address the issue of how well the values of a variable, A_i, predict a priori known class labels in a supervised context. The sum... |

173 |
An empirical comparison of pruning methods for decision tree induction
- Mingers
- 1989
Citation Context: ...crease the similarity of observations across clusters (i.e., coupling). Category utility is similar in form to the Gini Index, which has been used in supervised systems that construct decision trees (Mingers, 1989b; Weiss & Kulikowski, 1991). The Gini Index is typically intended to address the issue of how well the values of a variable, A_i, predict a priori known class labels in a supervised context. The sum... |

106 |
A heuristic approach to the discovery of macro-operators
- Iba
- 1989
Citation Context: ...ptimization that iteratively reclassifies single observations, and a third method appears novel in the clustering literature. This latter strategy was inspired, in part, by macro-learning strategies (Iba, 1989) -- collections of observations are reclassified en masse, which appears to mitigate problems associated with local maxima as measured by the objective function. For evaluation purposes, we couple th... |
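The excerpt above contrasts single-observation reclassification with macro-style moves that shift collections of observations en masse to escape local maxima. A minimal sketch of the en-masse idea, under my own toy objective (negative squared error plus a per-cluster cost, so that merging similar groups can pay off); this is an illustration of the general move type, not the paper's actual procedure:

```python
def objective(clusters, data, penalty=1.0):
    """Toy score: negative within-cluster squared error, minus a fixed
    cost per non-empty cluster so merging similar groups can improve it."""
    total = 0.0
    for members in clusters:
        if not members:
            continue
        total -= penalty
        pts = [data[i] for i in members]
        mean = [sum(c) / len(pts) for c in zip(*pts)]
        total -= sum(sum((x - m) ** 2 for x, m in zip(p, mean)) for p in pts)
    return total

def macro_reclassify(clusters, data):
    """Reclassify whole groups en masse: try moving each entire cluster
    into another, keeping the change only if the objective improves."""
    score = objective(clusters, data)
    for si in range(len(clusters)):
        for di in range(len(clusters)):
            if si == di or not clusters[si]:
                continue
            moved = set(clusters[si])
            clusters[di] |= moved
            clusters[si].clear()
            new_score = objective(clusters, data)
            if new_score > score:
                score = new_score  # keep the en-masse move
            else:
                clusters[di] -= moved  # revert
                clusters[si] |= moved
    return clusters, score
```

A single-observation hill climber cannot merge two small, similar clusters in one step without passing through worse intermediate states; the en-masse move above does it in one accepted step, which is the local-maxima benefit the excerpt alludes to.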

105 |
Experiments with incremental concept formation: Unimem
- Lebowitz
- 1987
Citation Context: ...t confronted with much the same task. 5.4 Other Issues There are many important issues in clustering that we will not address in depth. One of these is the possible advantage of overlapping clusters (Lebowitz, 1987; Martin & Billman, 1994). We have assumed tree-structured clusterings, which store each observation in more than one cluster, but these clusters are related by a proper subset-of relation as one desc... |

99 | Automated Construction of Classifications: Conceptual Clustering Versus Numerical Taxonomy - Michalski, Stepp - 1983 |

97 | Information, uncertainty, and the utility of categories - Gluck, Corter - 1985 |

64 | GALOIS: An Order-Theoretic Approach to Conceptual Clustering - Carpineto, Romano - 1993 |

56 |
Reconstructive memory: A computer model
- Kolodner
- 1983
Citation Context: ...tegy that can be used in conjunction with an objective function, the general idea of focusing on selected features during classification can be traced back to Unimem (Lebowitz, 1982, 1987) and Cyrus (Kolodner, 1983). The results of Table 8 illustrate the form of an expected classification-cost analysis, but we might have also measured cost as time directly using a test set. In fact, comparisons between the time... |

52 | A distance-based attribute selection measure for decision tree induction - López de Màntaras, R - 1991 |

48 |
Approaches to conceptual clustering
- Fisher, Langley
- 1985
Citation Context: ...(C_k) Σ_i Σ_j [P(A_i = V_ij | C_k)^2 - P(A_i = V_ij)^2]; and/or variants have been used extensively by a system known as Cobweb (Fisher, 1987a) and many related systems (Gennari, Langley, & Fisher, 1989; McKusick & Thompson, 1990; Iba & Gennari, 1991; McKusick & Langley, 1991; Reich & Fenves, 1991; Biswas, Weinberg, & Li, 1994; De Alte Da Veiga, 1994; Kilander, 1994; Ketterlin, Gancarski, & Korczak,... |
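The Gini-style per-cluster term in the excerpt, P(C_k) Σ_i Σ_j [P(A_i = V_ij | C_k)^2 - P(A_i = V_ij)^2], is straightforward to compute for nominal data. A minimal sketch; the function name and data layout (rows of nominal values, a cluster as a set of row indices) are my own, not the paper's:

```python
from collections import Counter

def category_utility_term(cluster, data):
    """Score one cluster C_k over nominal variables A_i with values V_ij:
    P(C_k) * sum_i sum_j [P(A_i = V_ij | C_k)^2 - P(A_i = V_ij)^2].
    `cluster` holds row indices into `data`; each row is a tuple of values."""
    n, n_k = len(data), len(cluster)
    p_ck = n_k / n
    score = 0.0
    for i in range(len(data[0])):           # variables A_i
        within = Counter(data[r][i] for r in cluster)
        overall = Counter(row[i] for row in data)
        for v in overall:                   # values V_ij
            score += (within.get(v, 0) / n_k) ** 2 - (overall[v] / n) ** 2
    return p_ck * score
```

The bracketed difference rewards clusters whose value distributions are more predictable within the cluster than in the data overall, which is the cohesion/decoupling intuition discussed in the Mingers citation contexts above.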

46 | An Improved Algorithm for Incremental Induction of Decision Trees - Utgoff - 1994 |

40 | On the Induction of Decision Trees for Multiple Concept Learning - Fayyad - 1991 |

35 | Intrinsic classification by MML - the Snob program - Wallace, Dowe - 1994 |

33 | Bayesian classification with correlation and inheritance - Hanson, Stutz, et al. - 1992 |

32 | Explaining basic categories: Features predictability and information - Corter, Gluck - 1992 |

29 |
A self-organizing retrieval system for graphs
- Levinson
- 1984
Citation Context: ...ructured clusterings, which store each observation in more than one cluster, but these clusters are related by a proper subset-of relation as one descends a path in the tree. In many cases, lattices (Levinson, 1984; Wilcox & Levinson, 1986; Carpineto & Romano, 1993), or more generally, directed acyclic graphs (DAGs) may be a better representation scheme. These structures allow an observation to be included in mu... |

27 | Concept simplification and prediction accuracy - Fisher, Schlimmer - 1988 |

23 | Constraints on tree structure in concept formation - McKusick, Langley - 1991 |

22 | Ordering effects in clustering - Fisher, Xu, et al. - 1992 |

21 | The formation and use of abstract concepts in design - Reich, Fenves - 1991 |

20 | Acquiring and Combining Overlapping Concepts - Martin, J, et al. - 1994 |

19 |
Structural principles in categorization
- Medin
- 1983
Citation Context: ...ssification cost). In general, measures motivated by a desire to reduce error rate will also favor cohesion and decoupling; this stems from two aspects of the pattern-completion task (Lebowitz, 1982; Medin, 1983). First, we assign an observation to a cluster based on the known variable values of the observation, which is best facilitated if variable-value predictiveness is high across many variables (i.e., c... |

18 | Applying AI clustering to engineering tasks - FISHER, XU, et al. - 1993 |

18 | The structure and formation of natural categories - LANGLEY, P - 1990 |

13 | Conceptual clustering and exploratory data analysis - Biswas, Weinberg, et al. - 1991 |

12 | A Conceptual Clustering Method for Knowledge Discovery in Databases - BISWAS, WEINBERG, et al. - 1995 |

11 | Focused Concept Formation - Gennari - 1989 |

11 | Computational models of concept learning - Fisher, Pazzani - 1991 |

11 | Iterative optimization and simplification of hierarchical clusterings - Fisher - 1996 |

10 |
Correcting erroneous generalizations
- Lebowitz
- 1982
Citation Context: ...offers a principled focusing strategy that can be used in conjunction with an objective function, the general idea of focusing on selected features during classification can be traced back to Unimem (Lebowitz, 1982, 1987) and Cyrus (Kolodner, 1983). The results of Table 8 illustrate the form of an expected classification-cost analysis, but we might have also measured cost as time directly using a test set. In f... |

10 | ITERATE: A conceptual clustering method for knowledge discovery in databases - Biswas, Weinberg, et al. - 1994 |

9 | Concept formation by incremental conceptual clustering - Hadzikadic, Yun - 1989 |

9 | Automated Construction of Classifications: Conceptual Clustering Versus Numerical Taxonomy - Michalski, Stepp - 1983 |

8 | An Improved Algorithm for Incremental Induction of Decision Trees - Utgoff - 1994 |

7 | Hierarchical clustering of composite objects with a variable number of components - Ketterlin, Gangarski, et al. - 1995 |

6 | Database management and analysis tools of machine induction - Fisher, Hapanyengwi - 1993 |

6 | AutoClass: A Bayesian Classification System - Cheeseman, Kelly, et al. - 1988 |

5 |
Description contrasting in incremental concept formation
- Decaestecker
- 1991
Citation Context: ...stored as singleton clusters at leaves of the tree. Other hierarchical-sort based strategies augment this basic procedure in a manner described in Section 3.3 (Fisher, 1987a; Hadzikadic & Yun, 1989; Decaestecker, 1991). [Figure residue omitted: example clustering tree with conditional probabilities, e.g. P(C7|C4)=0.50, P(C8|C4)=0.50] ... |

5 | Learning to recognize movements - Iba, Gennari - 1991 |

5 | Cobweb/3: A portable implementation (Tech - McKusick - 1990 |