Improved Heterogeneous Distance Functions
Journal of Artificial Intelligence Research, 1997
Cited by 173 (9 self)
Instance-based learning techniques typically handle continuous and linear input values well, but often do not handle nominal input attributes appropriately. The Value Difference Metric (VDM) was designed to find reasonable distance values between nominal attribute values, but it largely ignores continuous attributes, requiring discretization to map continuous values into nominal values. This paper proposes three new heterogeneous distance functions, called the Heterogeneous Value Difference Metric (HVDM), the Interpolated Value Difference Metric (IVDM), and the Windowed Value Difference Metric (WVDM). These new distance functions are designed to handle applications with nominal attributes, continuous attributes, or both. In experiments on 48 applications the new distance metrics achieve higher classification accuracy on average than three previous distance functions on those datasets that have both nominal and continuous attributes.
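As a rough illustration of the kind of nominal-attribute distance this abstract refers to, the sketch below computes a simplified value difference between two values of a single nominal attribute from class-conditional frequencies. The formula used (the sum over classes of |P(c|x) - P(c|y)|^q), the function name `vdm`, the default q=2, and the toy data are assumptions made for illustration and are not taken from the abstract; the HVDM, IVDM, and WVDM functions proposed in the paper are not reproduced here.

```python
from collections import Counter, defaultdict

def vdm(values, labels, x, y, q=2):
    """Simplified value difference between nominal values x and y of one attribute.

    Assumed form: sum over classes c of |P(c | x) - P(c | y)| ** q.
    """
    counts = defaultdict(Counter)  # counts[value][cls] = number of occurrences
    for v, c in zip(values, labels):
        counts[v][c] += 1
    n_x = sum(counts[x].values())
    n_y = sum(counts[y].values())
    if n_x == 0 or n_y == 0:
        raise ValueError("both values must occur in the training data")
    return sum(
        abs(counts[x][c] / n_x - counts[y][c] / n_y) ** q
        for c in set(labels)
    )

# Hypothetical toy data: one nominal attribute and a class label per instance.
colors = ["red", "red", "blue", "blue", "green", "green"]
classes = ["a", "a", "b", "b", "a", "b"]
d_red_blue = vdm(colors, classes, "red", "blue")    # values with opposite class profiles
d_red_green = vdm(colors, classes, "red", "green")  # values with partly shared class profiles
```

Under this assumed definition, two values are close when they co-occur with the classes in similar proportions, which is why no ordering of the nominal values is needed.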
Reduction Techniques for Instance-Based Learning Algorithms
Machine Learning, 2000
Cited by 93 (2 self)
Instance-based learning algorithms are often faced with the problem of deciding which instances to store for use during generalization. Storing too many instances can result in large memory requirements and slow execution speed, and can cause an oversensitivity to noise. This paper has two main purposes. First, it provides a survey of existing algorithms used to reduce storage requirements in instance-based learning algorithms and other exemplar-based algorithms. Second, it proposes six additional reduction algorithms called DROP1--DROP5 and DEL (three of which were first described in Wilson & Martinez, 1997c, as RT1--RT3) that can be used to remove instances from the concept description. These algorithms and 10 algorithms from the survey are compared on 31 classification tasks. Of those algorithms that provide substantial storage reduction, the DROP algorithms have the highest average generalization accuracy in these experiments, especially in the presence of uniform class noise. ...
Prototype Selection for Composite Nearest Neighbor Classifiers
1997
Cited by 22 (1 self)
Combining the predictions of a set of classifiers has been shown to be an effective way to create composite classifiers that are more accurate than any of the component classifiers. Increased accuracy has been shown in a variety of real-world applications, ranging from protein sequence identification to determining the fat content of ground meat. Despite such individual successes, the answers are not known to fundamental questions about classifier combination, such as "Can classifiers from any given model class be combined to create a composite classifier with higher accuracy?" or "Is it possible to increase the accuracy of a given classifier by combining its predictions with those of only a small number o...
Reduction Techniques for Exemplar-Based Learning Algorithms
Machine Learning, 2000
Cited by 19 (2 self)
Exemplar-based learning algorithms are often faced with the problem of deciding which instances or other exemplars to store for use during generalization. Storing too many exemplars can result in large memory requirements and slow execution speed, and can cause an oversensitivity to noise. This paper has two main purposes. First, it provides a survey of existing algorithms used to reduce the number of exemplars retained in exemplar-based learning models. Second, it proposes six new reduction algorithms called DROP1-5 and DEL that can be used to prune instances from the concept description. These algorithms and 10 algorithms from the survey are compared on 31 datasets. Of those algorithms that provide substantial storage reduction, the DROP algorithms have the highest generalization accuracy in these experiments, especially in the presence of noise.
An Integrated Instance-Based Learning Algorithm
Computational Intelligence, 2000
Cited by 19 (1 self)
The basic nearest-neighbor rule generalizes well in many domains but has several shortcomings, including inappropriate distance functions, large storage requirements, slow execution time, sensitivity to noise, and an inability to adjust its decision boundaries after storing the training data. This paper proposes methods for overcoming each of these weaknesses and combines these methods into a comprehensive learning system called the Integrated Decremental Instance-Based Learning Algorithm (IDIBL) that seeks to reduce storage, improve execution speed, and increase generalization accuracy, when compared to the basic nearest neighbor algorithm and other learning models. IDIBL tunes its own parameters using a new measure of fitness that combines confidence and cross-validation (CVC) accuracy in order to avoid discretization problems with more traditional leave-one-out cross-validation (LCV). In our experiments IDIBL achieves higher generalization accuracy than other less comprehensive instance-based learning algorithms, while requiring less than one-fourth the storage of the nearest neighbor algorithm and improving execution speed by a corresponding factor. In experiments on 21 datasets, IDIBL also achieves higher generalization accuracy than those reported for 16 major machine learning and neural network models.
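For readers unfamiliar with leave-one-out cross-validation (LCV) as mentioned in this abstract, a minimal sketch for a 1-nearest-neighbor classifier follows. The function name `loo_accuracy`, the use of Euclidean distance, and the toy data are assumptions for illustration; the CVC measure and the IDIBL algorithm themselves are not reproduced here.

```python
import math

def loo_accuracy(data):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier.

    `data` is a list of (feature_tuple, label) pairs with at least two entries.
    Each instance is classified by the nearest of the remaining instances
    (Euclidean distance is assumed).
    """
    correct = 0
    for i, (x, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # hold out instance i
        _, predicted = min(rest, key=lambda item: math.dist(item[0], x))
        correct += predicted == label
    return correct / len(data)

# Hypothetical one-dimensional toy data.
points = [((0,), "a"), ((1,), "a"), ((2,), "a"),
          ((10,), "b"), ((11,), "b"), ((4,), "b")]
accuracy = loo_accuracy(points)
```

Because the result is a count of correct predictions divided by the number of instances, it can only take values that are multiples of 1/n, so different parameter settings can easily tie on this measure.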
Proximity Graphs for Nearest Neighbor Decision Rules: Recent Progress
Proceedings of the 34th Symposium on the INTERFACE, 2002
Cited by 14 (0 self)
In the typical nonparametric approach to pattern classification, random data (the training set of patterns) are collected and used to design a decision rule (classifier). One of the most well known such rules is the k-nearest-neighbor decision rule (also known as instance-based learning, and lazy learning) in which an unknown pattern is classified into the majority class among its k nearest neighbors in the training set. Several questions related to this rule have received considerable attention over the years. Such questions include the following. How can the storage of the training set be reduced without degrading the performance of the decision rule? How should the reduced training set be selected to represent the different classes? How large should k be? How should the value of k be chosen? Should all k neighbors be equally weighted when used to decide the class of an unknown pattern? If not, how should the weights be chosen? Should all the features (attributes) be weighted equally, and if not, how should the feature weights be chosen? What distance metric should be used? How can the rule be made robust to overlapping classes or noise present in the training data? How can the rule be made invariant to scaling of the measurements? Geometric proximity graphs such as Voronoi diagrams and their many relatives provide elegant solutions to most of these problems. After a brief and non-exhaustive review of some of the classical canonical approaches to solving these problems, the methods that use proximity graphs are discussed, some new observations are made, and avenues for further research are proposed.
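The k-nearest-neighbor decision rule described in this abstract can be sketched as follows. This is a minimal illustration, assuming Euclidean distance, equally weighted neighbors, and equally weighted features; the function name `knn_classify` and the toy data are hypothetical, and the proximity-graph methods the paper discusses are not shown.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training patterns.

    `train` is a list of (feature_tuple, label) pairs; Euclidean distance is assumed.
    Vote ties go to the label that appears first among the sorted neighbors.
    """
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the training set size")
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical two-class training set in the plane.
train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B")]
near_origin = knn_classify(train, (0.5, 0.5), k=3)
near_far_cluster = knn_classify(train, (5.5, 5.5), k=3)
```

Several of the questions listed in the abstract correspond to the choices hard-coded here: the value of k, the distance function, the neighbor weighting, and the fact that the entire training set is stored and scanned for every query.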
A Teaching Strategy for Memory-Based Control
1997
Cited by 13 (0 self)
Combining different machine learning algorithms in the same system can produce benefits above and beyond what either method could achieve alone. This paper demonstrates that genetic algorithms can be used in conjunction with lazy learning to solve examples of a difficult class of delayed reinforcement learning problems better than either method alone. This class, the class of differential games, includes numerous important control problems that arise in robotics, planning, game playing, and other areas, and solutions for differential games suggest solution strategies for the general class of planning and control problems. We conducted a series of experiments applying three learning approaches---lazy Q-learning, k-nearest neighbor (k-NN), and a genetic algorithm---to a particular differential game called a pursuit game. Our experiments demonstrate that k-NN had great difficulty solving the problem, while a lazy version of Q-learning performed moderately well and the genetic algorithm pe...
Best-Case Results for Nearest Neighbor Learning
IEEE Trans. Pattern Anal. Machine Intell., 1995
Cited by 10 (0 self)
In this paper we propose a theoretical model for analysis of classification methods, in which the teacher knows the classification algorithm and chooses examples in the best way possible. We apply this model using the nearest-neighbor learning algorithm, and develop upper and lower bounds on sample complexity for several different concept classes. For some concept classes, the sample complexity turns out to be exponential even using this best-case model, which implies that the concept class is inherently difficult for the nearest-neighbor algorithm. We identify several geometric properties that make learning certain concepts relatively easy. Finally we discuss the relation of our work to helpful teacher models, its application to decision-tree learning algorithms, and some of its implications for current experimental work. Keywords: machine learning, nearest-neighbor, geometric concepts.
Enhancing prototype reduction schemes with LVQ3-type algorithms
Pattern Recognition, 2003
Cited by 7 (2 self)
Most of the prototype reduction schemes (PRS), which have been reported in the literature, process the data in its entirety to yield a subset of prototypes that are useful in nearest-neighbor-like classification. Foremost among these are the prototypes for nearest neighbor classifiers, the vector quantization technique, and the support vector machines. These methods suffer from a major disadvantage, namely, that of the excessive computational burden encountered by processing all the data. In this paper, we suggest a recursive and computationally superior mechanism referred to as adaptive recursive partitioning (ARP) PRS. Rather than process all the data using a PRS, we propose that the data be recursively subdivided into smaller subsets. This recursive subdivision can be arbitrary, and need not utilize any underlying clustering philosophy. The advantage of ARP PRS is that the PRS processes subsets of data points that effectively sample the entire
A Brief Taxonomy and Ranking of Creative Prototype Reduction Schemes
Pattern Analysis and Applications Journal, 2003
Cited by 6 (1 self)
Various Prototype Reduction Schemes (PRS) have been reported in the literature. Based on their operating characteristics, these schemes fall into two fairly distinct categories -- those which are of a creative sort, and those which are essentially selective. The norms for evaluating these methods are typically the reduction rate and the classification accuracy. It is generally believed that the former class of methods is superior to the latter. In this

