Instance-Based Learning: Nearest Neighbour with Generalisation
, 1995
Cited by 11 (0 self)
Instance-based learning is a machine learning method that classifies new examples by comparing them to those already seen and in memory. There are two types of instance-based learning: nearest neighbour and case-based reasoning. Of these two methods, nearest neighbour fell into disfavour during the 1980s, but regained popularity recently due to its simplicity and ease of implementation. Nearest neighbour learning is not without problems. It is difficult to define a distance function that works well for both discrete and continuous attributes. Noise and irrelevant attributes also pose problems. Finally, the specificity bias adopted by instance-based learning, while often an advantage, can over-represent small rules at the expense of more general concepts, leading to a marked decrease in classification performance for some domains. Generalised exemplars offer a solution. Examples that share the same class are grouped together, and so represent large rules more fully. This reduces the rol...
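As a rough illustration of the plain nearest neighbour idea this abstract starts from (not the generalised-exemplar method the paper proposes), the sketch below classifies a query by its closest stored example. The distance function is an assumption made for illustration: discrete attributes contribute 0 or 1, and continuous attributes contribute their difference scaled by a supplied range. The names `mixed_distance`, `nearest_neighbour`, and the sample data are hypothetical.

```python
from math import sqrt

def mixed_distance(a, b, ranges):
    """Distance over mixed attributes (illustrative assumption, not the paper's).

    ranges[i] is the (min, max) span of attribute i if it is continuous,
    or None if the attribute is discrete.
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Discrete attribute: 0 if the values match, 1 otherwise.
            d = 0.0 if x == y else 1.0
        else:
            lo, hi = r
            # Continuous attribute: absolute difference scaled to [0, 1].
            d = abs(x - y) / (hi - lo) if hi > lo else 0.0
        total += d * d
    return sqrt(total)

def nearest_neighbour(query, examples, ranges):
    """Return the class label of the stored example closest to the query."""
    _, label = min(examples, key=lambda ex: mixed_distance(query, ex[0], ranges))
    return label

# Hypothetical stored examples: (temperature, has_cough) -> class.
examples = [
    ((36.8, "no"), "normal"),
    ((39.5, "yes"), "not-normal"),
    ((37.0, "yes"), "normal"),
]
ranges = [(35.0, 41.0), None]  # first attribute continuous, second discrete

prediction = nearest_neighbour((39.0, "yes"), examples, ranges)
print(prediction)
```

Scaling each continuous attribute by its range keeps it from swamping the 0/1 contribution of discrete attributes, which is one common way to approach the mixed-attribute difficulty the abstract mentions.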
Feature Selection as Retrospective Pruning in Hierarchical Clustering
- In Third International Symposium on Intelligent Data Analysis, IDA 99
, 1999
Cited by 4 (2 self)
Although feature selection is a central problem in inductive learning as suggested by the growing amount of research in this area, most of the work has been carried out under the supervised learning paradigm, paying little attention to unsupervised learning tasks and, particularly, clustering tasks. In this paper, we analyze the particular benefits that feature selection may provide in hierarchical clustering. We propose a view of feature selection as a tree pruning process similar to those used in decision tree learning. Under this framework, we perform several experiments using different pruning strategies and considering a multiple prediction task. Results suggest that hierarchical clusterings can be greatly simplified without diminishing accuracy. 1 Introduction. The widespread use of information technologies produces a growing amount of data which is too large to be analyzed by manual methods. There are large volumes of data containing both many features and many examples. Indu...
Expected Error Analysis for Model Selection
- International Conference on Machine Learning (ICML)
, 1999
Cited by 4 (1 self)
In order to select a good hypothesis language (or model) from a collection of possible models, one has to assess the generalization performance of the hypothesis which is returned by a learner that is bound to use some particular model. This paper deals with a new and very efficient way of assessing this generalization performance. We present a new analysis which characterizes the expected generalization error of the hypothesis with least training error in terms of the distribution of error rates of the hypotheses in the model. This distribution can be estimated very efficiently from the data which immediately leads to an efficient model selection algorithm. The analysis predicts learning curves with a very high precision and thus contributes to a better understanding of why and when over-fitting occurs. We present empirical studies (controlled experiments on Boolean decision trees and a large-scale text categorization problem) which show that the model selection algorithm leads to err...
Is consistency harmful?
- In Proceedings of the ML92 Workshop on Biases in Inductive Learning
, 1992
Cited by 2 (1 self)
One of the major goals of most early concept learners was to find hypotheses that were perfectly consistent with the training data. It was believed that this goal would indirectly achieve a high degree of predictive accuracy on a set of test data. Later research has partially disproved this belief. However, the issue of consistency has not yet been resolved completely. We examine the issue of consistency from a new perspective. To avoid overfitting the training data, a considerable number of current systems have sacrificed the goal of learning hypotheses that are perfectly consistent with the training instances by setting a new goal of hypothesis simplicity (Occam’s razor). Instead of using simplicity as a goal, we have developed a novel approach that addresses consistency directly. In other words, our concept learner has the explicit goal of selecting the most appropriate degree of consistency with the training data. We begin this paper by exploring concept learning with less than perfect consistency. Next, we describe a system that can adapt its degree of consistency in response to feedback about predictive accuracy on test data. Finally, we present the results of initial experiments that begin to address the question of how tightly hypotheses should fit the training data for different problems.
Pessimistic and Optimistic Induction
, 1992
Cited by 2 (0 self)
Learning methods vary in the optimism or pessimism with which they regard the informativeness of learned knowledge. Pessimism is implicit in hypothesis testing, where we wish to draw cautious conclusions from experimental evidence. However, this paper demonstrates that optimism in the utility of derived rules may be the preferred bias for learning systems themselves. We examine the continuum between naive pessimism and naive optimism in the context of a decision tree learner that prunes rules based on stringent (i.e., pessimistic) or weak (i.e., optimistic) tests of their significance. Our experimental results indicate that in most cases optimism is preferred, but particularly in cases of sparse training data and high noise. This work generalizes earlier findings by Fisher and Schlimmer (1988) and Schaffer (1992), and we discuss its relevance to unsupervised learning, small disjuncts, and other issues. Keywords: Inductive learning, classification, decision-tree learning, decisionlis...
Concept Reliability in Machine Learning
- Proceedings of the Second Midwest Artificial Intelligence and Cognitive Science Society Conference. J. Dinsmore and T. Koschmann (Eds.)
, 1990
Cited by 1 (1 self)
Introduction. Much machine learning research addresses inductive learning, that is, learning relationships from a set of examples (Michalski (1986) provides an excellent introduction). For instance, some programs have been used to learn medical diagnostic rules from a database of patients whose diagnoses are known. These programs examine a number of attributes (e.g. age, temperature, and pulse rate) for a set of examples whose classification (e.g. diagnosis) is known. This set of examples is termed a training set. Attribute tests are combined into logical rules which are used to predict the classification (e.g. if (age > 5) and (temperature > 100), then (preliminary-diagnosis = not-normal)). These rules are generally termed concepts. Reliability and induction. The probability that a given concept will accurately classify a training set by chance alone, denoted here as P, is a fundamental cha
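The example rule in this abstract can be sketched as a conjunction of attribute tests applied to a training set. This is only an illustration of the representation, not the paper's reliability measure P. The training examples, the fallback class `"normal"`, and the helper names `rule_fires` and `accuracy` are hypothetical.

```python
import operator

# Comparison symbols mapped to the standard-library operator functions.
OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq}

def rule_fires(tests, example):
    """True if every (attribute, comparison, threshold) test holds for the example."""
    return all(OPS[op](example[attr], value) for attr, op, value in tests)

def accuracy(tests, predicted, default, training_set):
    """Fraction of training examples classified correctly.

    The rule predicts `predicted` when it fires; `default` is an assumed
    fallback class for examples the rule does not cover.
    """
    correct = 0
    for example, label in training_set:
        guess = predicted if rule_fires(tests, example) else default
        correct += guess == label
    return correct / len(training_set)

# The rule from the abstract: if (age > 5) and (temperature > 100) then not-normal.
rule = [("age", ">", 5), ("temperature", ">", 100)]

# Hypothetical training set, invented purely for illustration.
training_set = [
    ({"age": 30, "temperature": 102}, "not-normal"),
    ({"age": 3, "temperature": 103}, "normal"),
    ({"age": 40, "temperature": 98}, "normal"),
    ({"age": 12, "temperature": 101}, "normal"),
]

score = accuracy(rule, "not-normal", "normal", training_set)
print(score)
```

Representing each test as data rather than as hard-coded conditions makes it straightforward to score many candidate rules against the same training set.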
Learning Two-Tiered Descriptions of Imprecise Concepts: A Method Employing Examples of Varied Typicality and an Optimized Base Concept Representation: Part I: Principles and Methodology
Cited by 1 (1 self)
A method for learning flexible concepts is described, that is, concepts that are imprecise and context dependent. The method is based on a two-tiered concept representation. In such a representation the first tier, called the Base Concept Representation, describes typical properties of a concept in an explicit, comprehensible, and efficient form. The second tier, called the Inferential Concept Interpretation, contains inference rules and metaknowledge that define allowable transformations of the concept under different contexts, and handle exceptional instances.
William M. Spears, Diana F. Gordon. Naval Research Laboratory, Washington, D.C. 20375 USA. (202) 767-9006. spears@aic.nrl.navy.mil, gordon@aic.nrl.navy.mil
- Naval Research Laboratory, Navy Center for
, 1994
One of the major goals of most early concept learners was to find hypotheses that were perfectly consistent with the training data. It was believed that this goal would indirectly achieve a high degree of predictive accuracy on a set of test data. Later research has partially disproved this belief. However, the issue of consistency has not yet been resolved completely. We examine the issue of consistency from a new perspective. To avoid overfitting the training data, a considerable number of current systems have sacrificed the goal of learning hypotheses that are perfectly consistent with the training instances by setting a goal of hypothesis simplicity (Occam's razor). Instead of using simplicity as a goal, we have developed a novel approach that addresses consistency directly. In other words, our concept learner has the explicit goal of selecting the most appropriate degree of consistency with the training data. We begin this paper by exploring concept learning with less than perfect...
Learning Flexible Concepts Using A Two-Tiered Representation
, 1993
Most human concepts are flexible in the sense that they inherently lack precise boundaries, and these boundaries are often context-dependent. This chapter describes a method for representing and inductively learning flexible concepts from examples. The basic idea is to represent such concepts using a two-tiered representation. Such a representation consists of two structures ("tiers"): the Base Concept Representation (BCR), which captures explicitly the basic and context-independent concept properties, and Inferential Concept Interpretation (ICI), which characterizes allowable concept modifications and context dependency. The proposed method has been implemented in the POSEIDON 3 system (also called AQ16), and tested on various practical problems, such as learning the concept of "Acceptable union contracts" and "Voting patterns of Republicans and Democrats in the U.S. Congress." In the experiments, the system generated concept descriptions that were both more accurate and simpler than those produced by other methods tested, such as methods employing simple exemplar-based representations, decision tree learning, and some previous methods for rule learning.

