Results 1 - 10 of 59
Knowledge acquisition via incremental conceptual clustering
- Machine Learning, 1987
- Cited by 569 (5 self)
Conceptual clustering is an important way of summarizing and explaining data. However, the recent formulation of this paradigm has allowed little exploration of conceptual clustering as a means of improving performance. Furthermore, previous work in conceptual clustering has not explicitly dealt with constraints imposed by real world environments. This article presents COBWEB, a conceptual clustering system that organizes data so as to maximize inference ability. Additionally, COBWEB is incremental and computationally economical, and thus can be flexibly applied in a variety of domains.
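The abstract above does not name the evaluation function. COBWEB is commonly described as scoring partitions with category utility over nominal attributes, which rewards clusters whose members make attribute values easier to predict. The following is a minimal sketch of that measure, not the paper's implementation; the function name `category_utility` and the tuple-based instance format are assumptions made for illustration.

```python
from collections import Counter


def category_utility(clusters):
    """Category utility of a partition over nominal attributes.

    clusters: list of clusters; each cluster is a list of instances,
    and each instance is a tuple of nominal attribute values.
    (This data layout is an assumption made for the sketch.)
    """
    instances = [x for c in clusters for x in c]
    n = len(instances)
    n_attrs = len(instances[0])

    def sum_sq_probs(items):
        # Sum over attributes and values of P(attribute = value)^2,
        # estimated from relative frequencies in `items`.
        total = 0.0
        for a in range(n_attrs):
            counts = Counter(x[a] for x in items)
            total += sum((cnt / len(items)) ** 2 for cnt in counts.values())
        return total

    baseline = sum_sq_probs(instances)
    # Weight each cluster's gain in predictability by its relative size.
    gain = sum(len(c) / n * (sum_sq_probs(c) - baseline) for c in clusters if c)
    # Dividing by the number of clusters penalizes over-splitting.
    return gain / len(clusters)


pure = [[("a", "x"), ("a", "x")], [("b", "y"), ("b", "y")]]
mixed = [[("a", "x"), ("b", "y")], [("a", "x"), ("b", "y")]]
print(category_utility(pure) > category_utility(mixed))
```

A partition whose clusters are internally homogeneous scores higher than one that mixes values, which is the sense in which such a score favors inference ability.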
Clustering with instance-level constraints
- In Proceedings of the Seventeenth International Conference on Machine Learning, 2000
- Cited by 116 (6 self)
One goal of research in artificial intelligence is to automate tasks that currently require human expertise; this automation is important because it saves time and brings problems that were previously too large to be solved into the feasible domain. Data analysis, or the ability to identify meaningful patterns and trends in large volumes of data, is an important task that falls into this category. Clustering algorithms are a particularly useful group of data analysis tools. These methods are used, for example, to analyze satellite images of the Earth to identify and categorize different land and foliage types or to analyze telescopic observations to determine what distinct types of astronomical bodies exist and to categorize each observation. However, most existing clustering methods apply general similarity techniques rather than making use of problem-specific information. This dissertation first presents a novel method for converting existing clustering algorithms into constrained clustering algorithms. The resulting methods are able to accept domain-specific information in the form of constraints on the output clusters. At the most general level, each constraint is an instance-level statement
Iterative Optimization and Simplification of Hierarchical Clusterings
- Journal of Artificial Intelligence Research, 1995
- Cited by 96 (1 self)
Clustering is often used for discovering structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. Ideally, the search strategy should consistently construct clusterings of high quality, but be computationally inexpensive as well. In general, we cannot have it both ways, but we can partition the search so that a system inexpensively constructs a 'tentative' clustering for initial examination, followed by iterative optimization, which continues to search in background for improved clusterings. Given this motivation, we evaluate an inexpensive strategy for creating initial clusterings, coupled with several control strategies for iterative optimization, each of which repeatedly modifies an initial clustering in search of a better one. One of these methods appears novel as an iterative optimization strategy in clustering contexts. Once a clustering has been construct...
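The two-phase idea in this abstract (a cheap tentative clustering, then repeated modification in search of a better one) can be sketched as follows. This is only an illustration under stated assumptions: it uses one-dimensional points, a round-robin initial assignment, and within-cluster sum of squared errors as the objective, none of which are taken from the paper. The names `initial_clustering`, `iterative_optimization`, and `sse` are hypothetical.

```python
def sse(clusters):
    """Within-cluster sum of squared errors (assumed objective; lower is better)."""
    total = 0.0
    for c in clusters:
        if c:
            mean = sum(c) / len(c)
            total += sum((x - mean) ** 2 for x in c)
    return total


def initial_clustering(points, k):
    """Inexpensive tentative clustering: round-robin assignment in arrival order."""
    clusters = [[] for _ in range(k)]
    for i, x in enumerate(points):
        clusters[i % k].append(x)
    return clusters


def iterative_optimization(clusters):
    """Hill climbing: move single points between clusters while the objective improves.

    Modifies `clusters` in place and also returns it.
    """
    improved = True
    while improved:
        improved = False
        best = sse(clusters)
        for src in range(len(clusters)):
            for x in list(clusters[src]):  # snapshot, since we mutate below
                for dst in range(len(clusters)):
                    if dst == src:
                        continue
                    clusters[src].remove(x)
                    clusters[dst].append(x)
                    score = sse(clusters)
                    if score < best - 1e-12:
                        best = score
                        improved = True
                        break  # keep the move, go on to the next point
                    # Undo the move: it did not improve the objective.
                    clusters[dst].remove(x)
                    clusters[src].append(x)
    return clusters


points = [1.0, 1.1, 10.0, 10.1, 0.9, 9.9]
tentative = initial_clustering(points, 2)
before = sse(tentative)
result = iterative_optimization(tentative)
print(before, sse(result), sorted(sorted(c) for c in result))
```

Each accepted move strictly lowers the objective, so the loop terminates, but like any hill climber it may stop at a local optimum rather than the best clustering.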
Concept Formation in Structured Domains
- 1991
- Cited by 48 (2 self)
Abstractions are made over the structural information (relations)
Efficient Feature Selection in Conceptual Clustering
- In Proceedings of the Fourteenth International Conference on Machine Learning, 1997
- Cited by 39 (0 self)
Feature selection has proven to be a valuable technique in supervised learning for improving predictive accuracy while reducing the number of attributes considered in a task. We investigate the potential for similar benefits in an unsupervised learning task, conceptual clustering. The issues raised in feature selection by the absence of class labels are discussed and an implementation of a sequential feature selection algorithm based on an existing conceptual clustering system is described. Additionally, we present a second implementation which employs a technique for improving the efficiency of the search for an optimal description and compare the performance of both algorithms.
1 Introduction
The choice of which attributes to use in describing a given input has crucial impact on the classes induced by a learner. For this reason, the majority of real-world data sets used in inductive learning research have been constructed by domain experts and contain only those attributes which are...
An evaluation of techniques for clustering search results
- 1996
- Cited by 35 (3 self)
The ability to effectively organize retrieval results becomes more important as the focus of Information Retrieval (IR) shifts towards interactive search processes. Automatic classification techniques are capable of providing the necessary information organization by arranging the retrieved data into groups of documents with common subjects. In this paper, we compare classification methods from IR and Machine Learning (ML) for clustering search results. Issues such as document representation, classification algorithms, and cluster representation are discussed. We introduce several evaluation techniques and use them in preliminary experiments. These experiments indicate that the proposed techniques have promise, but it is clear that user experiments are required to carry out more thorough evaluation.
Requirements for Clustering Data Streams
- Cited by 25 (0 self)
Scientific and industrial examples of data streams abound in astronomy, telecommunication operations, banking and stock-market applications, e-commerce and other fields. A challenge imposed by continuously arriving data streams is to analyze them and to modify the models that explain them as new data arrives. In this paper, we analyze the requirements needed for clustering data streams. We review some of the latest algorithms in the literature and assess if they meet these requirements.
A Design for the Icarus Architecture
- 1991
- Cited by 23 (6 self)
Abstract plans are probabilistic summaries of specific plans, containing pointers to their components (abstract states, operators, and subplans) along with associated probabilities. For example, a generic plan for picking up an object (a manipulation plan) might have three subproblems, analogous to the event described above. Icarus uses the same approach to store route knowledge (navigation plans), with places acting as states and with operators like move and turn.
Components of Icarus
Our designs for the Icarus architecture call for three main components: a perceptual system (Argus), a planning system (Daedalus), and an execution system (Maeander). Argus and Daedalus invoke the memory system (Labyrinth) to retrieve structured experiences from long-term memory, which include objects, states, and plans. Labyrinth first sorts each component of an experience through memory, starting at the root node of the memory hierarchy. At each level, the memory system uses an evaluation function ca...
Constraints on tree structure in concept formation
- Proceedings of the Twelfth International Joint Conference on Artificial Intelligence (pp. 810-816), 1991
- Cited by 22 (0 self)
We describe ARACHNE, a concept formation system that uses explicit constraints on tree structure and local restructuring operators to produce well-formed probabilistic concept trees. We also present a quantitative measure of tree quality and compare the system's performance in artificial and natural domains to that of COBWEB, a well-known concept formation algorithm. The results suggest that ARACHNE frequently constructs higher-quality trees than COBWEB, while still retaining the ability to make accurate predictions.
Ontology Discovery for the Semantic Web Using Hierarchical Clustering
- 2001
- Cited by 21 (2 self)
According to a proposal by Tim Berners-Lee, the World Wide Web should be extended to make a Semantic Web where human understandable content is structured in such a way as to make it machine processable. Central to this conception is the establishment of shared ontologies, which specify the fundamental objects and relations important to particular online communities.

