Results 1 - 10 of 36
Instance-based learning algorithms
Machine Learning, 1991
Cited by 897 (18 self)
Abstract. Storing and using specific instances improves the performance of several supervised learning algorithms. These include algorithms that learn decision trees, classification rules, and distributed networks. However, no investigation has analyzed algorithms that use only specific instances to solve incremental learning tasks. In this paper, we describe a framework and methodology, called instance-based learning, that generates classification predictions using only specific instances. Instance-based learning algorithms do not maintain a set of abstractions derived from specific instances. This approach extends the nearest neighbor algorithm, which has large storage requirements. We describe how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy. While the storage-reducing algorithm performs well on several real-world databases, its performance degrades rapidly with the level of attribute noise in training instances. Therefore, we extended it with a significance test to distinguish noisy instances. This extended algorithm's performance degrades gracefully with increasing noise levels and compares favorably with a noise-tolerant decision tree algorithm.
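The abstract above does not say how storage is reduced. One common reading, taken here as an assumption, is a nearest-neighbor learner that processes instances one at a time and stores only those it currently misclassifies. The class name `StorageReducingNN`, the Euclidean distance, and the toy data are all hypothetical, and the noise-handling significance test is not modeled.

```python
import math


class StorageReducingNN:
    """Incremental 1-nearest-neighbor learner that stores only
    the training instances it misclassifies (an assumed mechanism)."""

    def __init__(self):
        self.stored = []  # list of (features, label) pairs

    def predict(self, x):
        """Return the label of the closest stored instance, or None if empty."""
        if not self.stored:
            return None
        _, label = min(self.stored, key=lambda inst: math.dist(inst[0], x))
        return label

    def learn(self, x, label):
        """Process one training instance; keep it only if the current
        stored set would have classified it incorrectly."""
        if self.predict(x) != label:
            self.stored.append((tuple(x), label))


# Toy stream with two well-separated classes (illustrative data only).
stream = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((5.0, 5.0), "b"),
    ((5.2, 4.9), "b"), ((0.2, 0.1), "a"), ((4.8, 5.1), "b"),
]
model = StorageReducingNN()
for features, label in stream:
    model.learn(features, label)

print(len(model.stored), "of", len(stream), "instances stored")
```

On clean, well-separated data like this, most instances are classified correctly on arrival and are never stored. Noisy instances tend to be misclassified and therefore kept, which is consistent with the degradation under noise that the abstract reports.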
Experience With a Learning Personal Assistant
1994
Cited by 193 (6 self)
Personal software assistants that help users with tasks like finding information, scheduling calendars, or managing work-flow will require significant customization to each individual user. For example, an assistant that helps schedule a particular user’s calendar will have to know that user’s scheduling preferences. This paper explores the potential of machine learning methods to automatically create and maintain such customized knowledge for personal software assistants. We describe the design of one particular learning assistant: a calendar manager, called CAP (Calendar APprentice), that learns user scheduling preferences from experience. Results are summarized from approximately five user-years of experience, during which CAP has learned an evolving set of several thousand rules that characterize the scheduling preferences of its users. Based on this experience, we suggest that machine learning methods may play an important role in future personal software assistants.
Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables
Machine Learning, 1997
Cited by 155 (9 self)
We discuss Bayesian methods for learning Bayesian networks when data sets are incomplete. In particular, we examine asymptotic approximations for the marginal likelihood of incomplete data given a Bayesian network. We consider the Laplace approximation and the less accurate but more efficient BIC/MDL approximation. We also consider approximations proposed by Draper (1993) and Cheeseman and Stutz (1995). These approximations are as efficient as BIC/MDL, but their accuracy has not been studied in any depth. We compare the accuracy of these approximations under the assumption that the Laplace approximation is the most accurate. In experiments using synthetic data generated from discrete naive-Bayes models having a hidden root node, we find that (1) the BIC/MDL measure is the least accurate, having a bias in favor of simple models, and (2) the Draper and CS measures are the most accurate.
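For orientation, the BIC approximation named in this abstract is usually written in the standard textbook form below. The abstract itself does not give the formula, and the symbols are assumptions: $D$ is the data, $M$ the model (network structure), $\hat{\theta}$ the maximum-likelihood parameters, $d$ the number of free parameters, and $N$ the number of cases.

```latex
\log p(D \mid M) \;\approx\; \log p(D \mid \hat{\theta}, M) \;-\; \frac{d}{2} \log N
```

The second term is a penalty that grows with the number of parameters $d$.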
Concept Learning and the Problem of Small Disjuncts
1995
Cited by 136 (1 self)
Ideally, definitions induced from examples should consist of all, and only, disjuncts that are meaningful (e.g., as measured by a statistical significance test) and have a low error rate. Existing inductive systems create definitions that are ideal with regard to large disjuncts, but far from ideal with regard to small disjuncts, where a small (large) disjunct is one that correctly classifies few (many) training examples. The problem with small disjuncts is that many of them have high rates of misclassification, and it is difficult to eliminate the error-prone small disjuncts from a definition without adversely affecting other disjuncts in the definition. Various approaches to this problem are evaluated, including the novel approach of using a bias different than the "maximum generality" bias. This approach, and some others, prove partly successful, but the problem of small disjuncts remains open.
Data Perturbation for Escaping Local Maxima in Learning
In AAAI, 2002
Cited by 29 (3 self)
Almost all machine learning algorithms---be they for regression, classification or density estimation---seek hypotheses that optimize a score on training data. In most interesting cases, however, full global optimization is not feasible and local search techniques are used to discover reasonable solutions. Unfortunately,
Toward a Unified Theory of Learning: Multistrategy Task-Adaptive Learning
IN: READINGS IN KNOWLEDGE ACQUISITION AND, 1993
Cited by 28 (9 self)
Any learning process can be viewed as a self-modification of the learner's current knowledge through an interaction with some information source. Such knowledge modification is guided by the learner's desire to achieve a certain outcome, and can engage any kind of inference. The type of inference involved depends on the input information, the current (background) knowledge and the learner's task at hand. Based on such a view of learning, several fundamental concepts are analyzed and clarified, in particular, analytic and synthetic learning, derivational and hypothetical explanation, constructive induction, abduction, abstraction and deductive generalization. It is shown that inductive generalization and abduction can be viewed as two basic forms of general induction, and that abstraction and deductive generalization are two related forms of constructive deduction. Using this conceptual framework, a methodology for multistrategy task-adaptive learning (MTL) is outlined, in which learning strategies are combined dynamically, depending on the current learning situation. Specifically, an MTL learner analyzes a "triad" relationship among the input information, the background knowledge and the learning task, and on that basis determines which strategy, or a combination thereof, is most appropriate at a given learning step. To implement the MTL methodology, a new knowledge representation is proposed, based on the parametric association rules (PARs). Basic ideas of MTL are illustrated by means of the well-known "cup" example, through which it is shown how an MTL learner can employ, depending on the above triad relationship, empirical learning, constructive inductive generalization, abduction, explanation-based learning and abstraction.
History of success and current context in problem solving: Combined influences on operator selection
Cognitive Psychology, 1996
Cited by 28 (7 self)
Problem solvers often have multiple operators available to them but must select just one to apply. We present three experiments that demonstrate that solvers use at least two sources of information to make operator selections in the building sticks task (BST): information from their past history of using the operators and information from the current context of the problem. Specifically, problem solvers are more likely to use an operator the more successful it has been in the past and the closer it takes the current state to the goal state. These two effects, respectively, represent the learning and performance processes that influence solvers' operator selections. A computational model of BST problem solving, developed within the ACT-R theory (Anderson, 1993), provides the unifying framework in which both types of processes can be integrated to predict solvers' selection tendencies. © 1996 Academic Press, Inc. Most problems can be approached in multiple ways but solved by only a few. Problem solving can be viewed, then, as finding one of the few paths that leads from a problem's initial state to its goal state through some space of possible intermediate states (Newell & Simon, 1972). In this framework,
Relating case-based problem solving and learning methods to task and domain characteristics: Towards an analytic framework
Artificial Intelligence Communications (AICom), 1996
Cited by 13 (9 self)
A particular strength of case-based reasoning (CBR) over most other methods is its inherent combination of problem solving with sustained learning through problem solving experience. This is therefore a particularly important topic of study, and an issue that has now become mature enough to be addressed in a more systematic way. To enable such an analysis of problem solving and learning, we have initiated work towards the development of an analytic framework for studying CBR methods. It provides an explicit ontology of basic CBR task types, domain characterisations, and types of problem solving and learning methods. Further, it incorporates within this framework a methodology for combining a knowledge-level, top-down analysis with a bottom-up, case-driven one. In this article, we present the underlying view and the basic approach being taken, the main components of the framework and accompanying methodology, examples of studies recently done and how they relate to the framework.
Learning Flexible Concepts from Streams of Examples: FLORA2
1992
Cited by 13 (4 self)
FLORA2 is a program for supervised learning of concepts that are subject to concept drift. The learning process is incremental in that the examples are processed one by one. A special feature of our program consists in keeping in memory a subset of examples -- a window. In time, new examples are being added to the window while other ones are considered outdated and are forgotten. In order to track the concept drift, the system keeps in memory not only valid descriptions of the concepts as they are derived from the objects currently present in the window, but also `candidate descriptions' that may turn into valid descriptions in the future.
1 Introduction
One of the key tasks of the Machine Learning discipline is to find powerful methods for abstracting concepts out of a set of objects. Basically, two subproblems of this task exist: supervised learning and unsupervised learning. The former assumes that a set of preclassified examples (positive and negative) of some concept(s) are avail...
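The windowing idea in the FLORA2 abstract can be illustrated with a minimal sketch. This is not FLORA2 itself. It assumes a fixed window size and predicts by majority vote over the windowed examples, whereas the abstract describes valid and candidate concept descriptions derived from the window. The class name `WindowedLearner` and the toy data are hypothetical.

```python
from collections import Counter, deque


class WindowedLearner:
    """Keeps only the most recent examples and predicts from them.
    Fixed window size and majority voting are simplifying assumptions."""

    def __init__(self, window_size):
        # deque with maxlen discards the oldest example automatically
        self.window = deque(maxlen=window_size)

    def learn(self, features, label):
        """Add one example; the oldest is forgotten once the window is full."""
        self.window.append((features, label))

    def predict(self, features):
        """Majority label among windowed examples with the same features,
        or None if no such example is currently in the window."""
        votes = Counter(lbl for feats, lbl in self.window if feats == features)
        return votes.most_common(1)[0][0] if votes else None


learner = WindowedLearner(window_size=3)

# Before the drift: "red" objects are positive examples.
for _ in range(3):
    learner.learn(("red",), True)
before = learner.predict(("red",))

# After the drift: "red" objects are now negative examples.
for _ in range(3):
    learner.learn(("red",), False)
after = learner.predict(("red",))

print(before, after)
```

Because outdated examples fall out of the window, the prediction follows the changed concept once enough new examples have arrived.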

