Iterate: A conceptual clustering algorithm for data mining
- IEEE Transactions on Systems, Man and Cybernetics, 1998
Cited by 17 (0 self)
The data exploration task can be divided into three interrelated subtasks: (i) feature selection, (ii) discovery, and (iii) interpretation. This paper describes an unsupervised discovery method with biases geared toward partitioning objects into clusters that improve interpretability. The algorithm, ITERATE, employs: (i) a data ordering scheme and (ii) an iterative redistribution operator to produce maximally cohesive and distinct clusters. Cohesion or intra-class similarity is measured in terms of the match between individual objects and their assigned cluster prototype. Distinctness or inter-class dissimilarity is measured by an average of the variance of the distribution match between clusters. We demonstrate that interpretability, from a problem solving viewpoint, is addressed by the intra- and inter-class measures. Empirical results demonstrate the properties of the discovery algorithm, and its applications to problem solving.
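The idea of iterative redistribution mentioned in the abstract above can be illustrated with a minimal sketch. This is not ITERATE itself: the abstract does not specify the match measure, the data ordering scheme, or the stopping rule, so the code assumes nominal attributes, a modal-value prototype, and a simple fraction-of-agreeing-attributes score (closer to a k-modes style loop). The names `prototype`, `match`, and `redistribute` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def prototype(cluster):
    """Assumed prototype: the most common value of each attribute."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*cluster))


def match(obj, proto):
    """Assumed match score: fraction of attributes agreeing with the prototype."""
    return sum(a == b for a, b in zip(obj, proto)) / len(proto)


def redistribute(objects, assignment, max_iters=20):
    """Move each object to its best-matching prototype until nothing changes."""
    assignment = list(assignment)
    for _ in range(max_iters):
        # Group objects by their current cluster label.
        clusters = {}
        for obj, label in zip(objects, assignment):
            clusters.setdefault(label, []).append(obj)
        protos = {label: prototype(members) for label, members in clusters.items()}
        # Reassign every object; ties go to the smallest label.
        new_assignment = [
            max(sorted(protos), key=lambda label: match(obj, protos[label]))
            for obj in objects
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
    return assignment


# Toy data: the last object starts in the wrong cluster.
objects = [
    ("red", "round", "small"),
    ("red", "round", "large"),
    ("blue", "square", "large"),
    ("blue", "square", "small"),
    ("red", "round", "small"),
]
print(redistribute(objects, [0, 0, 1, 1, 1]))
```

In this toy run the misplaced object moves to the cluster whose prototype it matches on every attribute, after which the assignment is stable. A measure of distinctness between clusters, as described in the abstract, is not modeled here.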
An Intelligent Digital Library System for Biologists
- 2004
Cited by 1 (0 self)
To aid researchers in obtaining, organizing and managing biological data, we have developed a sophisticated digital library system that utilizes advanced data mining techniques [Stone et al 2004a]. Our digital library system is implemented as a centralized J2EE web application with links to publicly accessible data repositories on the Internet. The digital library is based on a framework used for conventional libraries and an object-oriented paradigm, and provides personalized user-centered services based on the user’s areas of interest and preferences. To make personalized service possible, a “user profile” that represents the preferences of an individual user is constructed based upon a user’s past activities, goals indicated by the user, and options. Utilizing these user profiles, our system makes relevant information available to the user in an appropriate form, amount, and level of detail with minimal user effort. The core of our project is an agent architecture that provides advanced services by combining data mining capabilities with domain knowledge in the form of a semantic network [Stone et al 2004b]. The semantic network imparts a knowledge structure through which the system can “reason” and draw conclusions about biological data objects and provides a federated view of the many disparate databases of interest to biologists. In the development of our semantic network, we have included the concepts from several established controlled vocabularies, chief

