Results 1–10 of 224
Instance-based learning algorithms
Machine Learning, 1991
Cited by 1053 (18 self)
Storing and using specific instances improves the performance of several supervised learning algorithms. These include algorithms that learn decision trees, classification rules, and distributed networks. However, no investigation has analyzed algorithms that use only specific instances to solve incremental learning tasks. In this paper, we describe a framework and methodology, called instance-based learning, that generates classification predictions using only specific instances. Instance-based learning algorithms do not maintain a set of abstractions derived from specific instances. This approach extends the nearest neighbor algorithm, which has large storage requirements. We describe how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy. While the storage-reducing algorithm performs well on several real-world databases, its performance degrades rapidly with the level of attribute noise in training instances. Therefore, we extended it with a significance test to distinguish noisy instances. This extended algorithm's performance degrades gracefully with increasing noise levels and compares favorably with a noise-tolerant decision tree algorithm.
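The storage-reducing idea described above can be pictured with a small sketch (hypothetical code in the spirit of the paper's selective-storage variant, not the authors' implementation; all names are illustrative): an instance is stored only when the instances already in memory would misclassify it.

```python
import math

def dist(a, b):
    # Euclidean distance between numeric attribute vectors
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def nn_predict(memory, x):
    # predict with the label of the single nearest stored instance
    return min(memory, key=lambda inst: dist(inst[0], x))[1]

def train_storage_reducing(stream):
    # keep an instance only if the current memory misclassifies it
    memory = []
    for x, y in stream:
        if not memory or nn_predict(memory, x) != y:
            memory.append((x, y))
    return memory

stream = [((0.0,), "a"), ((0.1,), "a"), ((1.0,), "b"),
          ((0.9,), "b"), ((0.2,), "a")]
memory = train_storage_reducing(stream)
print(len(memory))                   # 2: one stored prototype per class
print(nn_predict(memory, (0.05,)))   # "a"
```

On this toy stream, only the first instance of each class is ever misclassified, so memory holds two prototypes instead of all five instances.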
Comparison of discrimination methods for the classification of tumors using gene expression data
Journal of the American Statistical Association, 2002
Cited by 501 (4 self)
A reliable and precise classification of tumors is essential for successful diagnosis and treatment of cancer. cDNA microarrays and high-density oligonucleotide chips are novel biotechnologies increasingly used in cancer research. By allowing the monitoring of expression levels in cells for thousands of genes simultaneously, microarray experiments may lead to a more complete understanding of the molecular variations among tumors and hence to a finer and more informative classification. The ability to successfully distinguish between tumor classes (already known or yet to be discovered) using gene expression data is an important aspect of this novel approach to cancer classification. This article compares the performance of different discrimination methods for the classification of tumors based on gene expression data. The methods include nearest-neighbor classifiers, linear discriminant analysis, and classification trees. Recent machine learning approaches, such as bagging and boosting, are also considered. The discrimination methods are applied to datasets from three recently published cancer gene expression studies.
BagBoosting for tumor classification with gene expression data
Bioinformatics, 2004
Cited by 126 (2 self)
Motivation: Microarray experiments are expected to contribute significantly to progress in cancer treatment by enabling a precise and early diagnosis. They create a need for class prediction tools that can deal with a large number of highly correlated input variables, perform feature selection, and provide class probability estimates that quantify the predictive uncertainty. A very promising solution is to combine the two ensemble schemes, bagging and boosting, into a novel algorithm called BagBoosting.
Results: When bagging is used as a module in boosting, the resulting classifier consistently improves the predictive performance and the probability estimates of both bagging and boosting on real and simulated gene expression data. This quasi-guaranteed improvement can be obtained simply by expending more computation. The advantageous predictive potential is also confirmed by comparing BagBoosting to several established class prediction tools for microarray data.
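The "bagging as a module in boosting" idea can be sketched as follows (an illustrative toy reconstruction with decision stumps on one-dimensional inputs, not the authors' algorithm or code): each boosting round fits a bagged committee of stumps on weighted bootstrap resamples, and AdaBoost-style weights are then updated against the committee's vote.

```python
import math
import random

random.seed(0)

def fit_stump(X, y, w):
    # exhaustively pick the weighted-error-minimizing threshold and polarity
    best = None
    for t in sorted(set(X)):
        for pol in (1, -1):
            err = sum(wi for xi, yi, wi in zip(X, y, w)
                      if (pol if xi <= t else -pol) != yi)
            if best is None or err < best[0]:
                best = (err, t, pol)
    _, t, pol = best
    return lambda v: pol if v <= t else -pol

def fit_bagged_stumps(X, y, w, B=7):
    # the "bagging module": B stumps on weighted bootstrap resamples, majority vote
    idx = range(len(X))
    stumps = []
    for _ in range(B):
        samp = random.choices(idx, weights=w, k=len(X))
        stumps.append(fit_stump([X[i] for i in samp],
                                [y[i] for i in samp],
                                [1.0] * len(samp)))
    return lambda v: 1 if sum(s(v) for s in stumps) >= 0 else -1

def bagboost(X, y, rounds=5):
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        h = fit_bagged_stumps(X, y, w)
        err = sum(wi for xi, yi, wi in zip(X, y, w) if h(xi) != yi)
        err = min(max(err, 1e-10), 1 - 1e-10)     # avoid log blow-ups
        alpha = 0.5 * math.log((1 - err) / err)   # AdaBoost model weight
        ensemble.append((alpha, h))
        w = [wi * math.exp(-alpha * yi * h(xi)) for xi, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return lambda v: 1 if sum(a * h(v) for a, h in ensemble) >= 0 else -1

X = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
y = [-1, -1, -1, 1, 1, 1]
clf = bagboost(X, y)
print([clf(v) for v in X])
```

Replacing the single weak learner of plain boosting with a bagged committee is the whole change; the extra computation buys more stable base predictions in each round.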
Flexible Metric Nearest Neighbor Classification, 1994
Cited by 123 (2 self)
The K-nearest-neighbor decision rule assigns an object of unknown class to the plurality class among the K labeled "training" objects that are closest to it. Closeness is usually defined in terms of a metric distance on the Euclidean space with the input measurement variables as axes. The metric chosen to define this distance can strongly affect performance. An optimal choice depends on the problem at hand as characterized by the respective class distributions on the input measurement space and, within a given problem, on the location of the unknown object in that space. In this paper, new types of K-nearest-neighbor procedures are described that estimate the local relevance of each input variable, or their linear combinations, for each individual point to be classified. This information is then used to separately customize the metric used to define distance from that object in finding its nearest neighbors. These procedures are a hybrid between regular K-nearest-neighbor methods and tree-structured recursive partitioning techniques popular in statistics and machine learning.
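One way to picture the scheme (a crude hypothetical sketch, not the paper's actual relevance estimator): take a preliminary Euclidean neighborhood around the query, score each feature by how far the class means are separated along it within that neighborhood, and re-run k-NN under the resulting weighted metric.

```python
import math
from collections import Counter

def weighted_dist(a, b, w):
    return math.sqrt(sum(wj * (p - q) ** 2 for wj, p, q in zip(w, a, b)))

def flexible_knn(train, query, k=3, k0=10):
    # step 1: preliminary neighborhood under the plain Euclidean metric
    unit = [1.0] * len(query)
    nbrs = sorted(train, key=lambda p: weighted_dist(p[0], query, unit))[:k0]
    # step 2: crude local relevance per feature: squared gap between the
    # class means inside the neighborhood (a stand-in for the paper's
    # relevance estimates)
    classes = {lab for _, lab in nbrs}
    w = []
    for j in range(len(query)):
        means = [sum(x[j] for x, lab in nbrs if lab == c) /
                 max(1, sum(1 for _, lab in nbrs if lab == c))
                 for c in classes]
        w.append((max(means) - min(means)) ** 2 + 1e-9)
    # step 3: k-NN vote under the locally customized metric
    knn = sorted(train, key=lambda p: weighted_dist(p[0], query, w))[:k]
    return Counter(lab for _, lab in knn).most_common(1)[0][0]

# feature 0 separates the classes; feature 1 is noise
train = [((0.1, 0.9), "a"), ((0.2, 0.1), "a"), ((0.3, 0.5), "a"),
         ((0.7, 0.8), "b"), ((0.8, 0.2), "b"), ((0.9, 0.6), "b")]
print(flexible_knn(train, (0.35, 0.95)))   # "a": feature 0 dominates the metric
```

Because the relevance weights are recomputed per query, the metric adapts to the location of the unknown object, which is the point of the paper.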
Mining Distance-Based Outliers in Near Linear Time with Randomization and a Simple Pruning Rule
, 2003
Cited by 104 (4 self)
Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal of finding fast algorithms for this task. We show that a simple nested loop algorithm that in the worst case is quadratic can give near linear time performance when the data is in random order and a simple pruning rule is used. We test our algorithm on real high-dimensional data sets with millions of examples and show that the near linear scaling holds over several orders of magnitude. Our average case analysis suggests that much of the efficiency is because the time to process non-outliers, which are the majority of examples, does not depend on the size of the data set.
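The nested loop with the pruning rule is short enough to sketch (a minimal reconstruction of the idea, not the paper's code): score each example by its distance to its k-th nearest neighbor, and abandon an example as soon as its running k-NN distance falls below the weakest score among the current top-n outliers.

```python
import heapq
import math
import random

def top_outliers(data, k=3, n=2, seed=0):
    # score = distance to the k-th nearest neighbor; return the top n
    pts = list(data)
    random.Random(seed).shuffle(pts)   # random order makes pruning fire early
    outliers = []                      # min-heap of (score, point): weakest first
    cutoff = 0.0                       # weakest score among current top-n outliers
    for i, x in enumerate(pts):
        nn = []                        # k smallest distances, as a negated max-heap
        pruned = False
        for j, p in enumerate(pts):
            if i == j:
                continue
            d = math.dist(x, p)
            if len(nn) < k:
                heapq.heappush(nn, -d)
            elif d < -nn[0]:
                heapq.heapreplace(nn, -d)
            # the simple pruning rule: once the running k-NN distance drops
            # below the cutoff, x can no longer be a top-n outlier
            if len(nn) == k and -nn[0] < cutoff:
                pruned = True
                break
        if pruned:
            continue
        heapq.heappush(outliers, (-nn[0], x))
        if len(outliers) > n:
            heapq.heappop(outliers)
        if len(outliers) == n:
            cutoff = outliers[0][0]
    return sorted(outliers, reverse=True)

data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
        (5.0, 5.0), (9.0, 9.0)]
print([p for _, p in top_outliers(data)])   # [(9.0, 9.0), (5.0, 5.0)]
```

The prune is exact, so the result does not depend on the shuffle; the shuffle only affects how quickly clustered non-outliers are abandoned, which is where the near-linear average case comes from.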
Relational Instance-Based Learning
Proceedings of the Thirteenth International Conference on Machine Learning, 1996
Cited by 66 (1 self)
A relational instance-based learning algorithm, called RIBL, is motivated and developed in this paper. We argue that instance-based methods offer solutions to the often unsatisfactory behavior of current inductive logic programming (ILP) approaches in domains with continuous attribute values and in domains with noisy attributes and/or examples. Three research issues that emerge when a propositional instance-based learner is adapted to a first-order representation are identified: (1) construction of cases from the knowledge base, (2) computation of similarity between arbitrarily complex cases, and (3) estimation of the relevance of predicates and attributes. Solutions to these issues are developed. Empirical results indicate that RIBL is able to achieve high classification accuracy in a variety of domains.
The Racing Algorithm: Model Selection for Lazy Learners
Artificial Intelligence Review, 1997
Cited by 50 (3 self)
Given a set of models and some training data, we would like to find the model that best describes the data. Finding the model with the lowest generalization error is a computationally expensive process, especially if the number of testing points is high or if the number of models is large. Optimization techniques such as hill climbing or genetic algorithms are helpful but can end up with a model that is arbitrarily worse than the best one, or cannot be used because there is no distance metric on the space of discrete models. In this paper we develop a technique called "racing" that tests the set of models in parallel, quickly discards those models that are clearly inferior, and concentrates the computational effort on differentiating among the better models. Racing is especially suitable for selecting among lazy learners since training requires negligible expense, and incremental testing using leave-one-out cross validation is efficient. We use racing to select among various lazy learning...
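A compact sketch of how racing might look when picking k for a k-NN (lazy) learner (an illustrative reconstruction using a Hoeffding-style elimination bound; the bound choice and all names are assumptions, not the paper's exact procedure): each surviving model is charged its leave-one-out errors point by point, and a model is dropped once its optimistic error estimate exceeds the leader's pessimistic one.

```python
import math
import random
from collections import Counter

def knn_loo_correct(data, i, k):
    # leave-one-out test of a k-NN model on point i; "training" is free
    x, y = data[i]
    nbrs = sorted((p for j, p in enumerate(data) if j != i),
                  key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))[:k]
    return Counter(lab for _, lab in nbrs).most_common(1)[0][0] == y

def race(data, ks, delta=0.05):
    alive = {k: 0 for k in ks}                  # running error counts
    order = list(range(len(data)))
    random.Random(0).shuffle(order)
    for t, i in enumerate(order, 1):
        for k in alive:
            alive[k] += 0 if knn_loo_correct(data, i, k) else 1
        eps = math.sqrt(math.log(2 / delta) / (2 * t))   # Hoeffding half-width
        best = min(e / t for e in alive.values())
        # drop models whose optimistic error still exceeds the leader's
        # pessimistic error
        alive = {k: e for k, e in alive.items() if e / t - eps <= best + eps}
        if len(alive) == 1:
            break
    return min(alive, key=alive.get)

# well-separated toy clusters: k = 9 votes across both classes and loses early
data = [((i / 10, 0.0), "a") for i in range(5)] + \
       [((i / 10 + 2.0, 0.0), "b") for i in range(5)]
print(race(data, [1, 3, 9]))   # 1
```

Here k = 9 misclassifies every leave-one-out point (nine neighbors always span both five-point classes), so the bound eliminates it after a handful of test points instead of scoring it on the whole set.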
Combining Nearest Neighbor Classifiers Through Multiple Feature Subsets
Cited by 46 (0 self)
Combining multiple classifiers is an effective technique for improving accuracy. There are many general combining algorithms, such as Bagging or Error Correcting Output Coding, that significantly improve classifiers like decision trees, rule learners, or neural networks. Unfortunately, many combining methods do not improve the nearest neighbor classifier. In this paper, we present MFS, a combining algorithm designed to improve the accuracy of the nearest neighbor (NN) classifier. MFS combines multiple NN classifiers, each using only a random subset of features. The experimental results are encouraging: on 25 datasets from the UCI Repository, MFS significantly improved upon the NN, k-nearest-neighbor (kNN), and NN classifiers with forward and backward selection of features. MFS was also robust to corruption by irrelevant features compared to the kNN classifier. Finally, we show that MFS is able to reduce both bias and variance components of error.
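The MFS recipe is simple enough to sketch directly (a minimal hypothetical rendering, not the authors' code): vote over several NN classifiers, each restricted to a random feature subset.

```python
import random
from collections import Counter

def nn_label(train, x, feats):
    # 1-NN restricted to the given feature subset
    return min(train,
               key=lambda p: sum((p[0][f] - x[f]) ** 2 for f in feats))[1]

def mfs_predict(train, x, n_classifiers=15, subset_size=2, seed=0):
    # majority vote over NN classifiers on random feature subsets
    rng = random.Random(seed)
    votes = [nn_label(train, x, rng.sample(range(len(x)), subset_size))
             for _ in range(n_classifiers)]
    return Counter(votes).most_common(1)[0][0]

# feature 2 is noise; every 2-feature subset still contains an
# informative feature, so the vote is robust
train = [((0.0, 0.0, 0.7), "a"), ((0.1, 0.1, 0.2), "a"),
         ((1.0, 1.0, 0.6), "b"), ((0.9, 0.9, 0.1), "b")]
print(mfs_predict(train, (0.05, 0.05, 0.9)))   # "a"
```

Sampling subsets (rather than perturbing the training set, as bagging does) is what gives the ensemble diversity for a stable learner like NN, and it explains the robustness to irrelevant features: subsets dominated by noise features are simply outvoted.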
An Empirical Investigation of Brute Force to Choose Features, Smoothers and Function Approximators
Computational Learning Theory and Natural Learning Systems, 1992
Cited by 42 (10 self)
The generalization error of a function approximator, feature set, or smoother can be estimated directly by the leave-one-out cross-validation error. For memory-based methods, this is computationally feasible. We describe an initial version of a general memory-based learning system (GMBL): a large collection of learners brought into a widely applicable machine-learning family. We present ongoing investigations into search algorithms which, given a dataset, find the family members and features that generalize best. We also describe GMBL's application to two noisy, difficult problems: predicting car engine emissions from pressure waves, and controlling a robot billiards player with redundant state variables.
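The brute-force recipe is cheap for memory-based learners because each leave-one-out fold is just a lookup with one point withheld. A toy sketch of selecting a feature set by leave-one-out error (hypothetical, with 1-NN standing in for GMBL's family of learners):

```python
def loo_error(data, feats):
    # leave-one-out error of 1-NN on a feature subset; each fold is just
    # a nearest-neighbor lookup with one point withheld ("training" is free)
    errs = 0
    for i, (x, y) in enumerate(data):
        nbr = min((p for j, p in enumerate(data) if j != i),
                  key=lambda p: sum((p[0][f] - x[f]) ** 2 for f in feats))
        errs += nbr[1] != y
    return errs / len(data)

def best_feature_set(data, candidates):
    # brute force: score every candidate subset, keep the best
    return min(candidates, key=lambda f: loo_error(data, f))

# feature 0 separates the classes; feature 1 is noise
data = [((0.0, 0.9), "a"), ((0.1, 0.1), "a"), ((0.2, 0.8), "a"),
        ((1.0, 0.2), "b"), ((0.9, 0.9), "b"), ((0.8, 0.3), "b")]
print(best_feature_set(data, [(0,), (1,), (0, 1)]))   # (0,)
```

The same loop scores smoothers or approximators instead of feature sets: swap the 1-NN lookup for any memory-based predictor and compare leave-one-out errors.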
Distribution-free consistency results in nonparametric discrimination and regression function estimation
Ann. Statist., 1980
Cited by 41 (9 self)
Let (X, Y) be an R^d × R-valued random pair and let (X_1, Y_1), …, (X_n, Y_n) be a random sample drawn from its distribution. We study the consistency properties of the kernel estimate m_n(x) of the regression function m(x) = E{Y | X = x} that is defined by
m_n(x) = Σ_{i=1}^n Y_i K((X_i − x)/h_n) / Σ_{i=1}^n K((X_i − x)/h_n),
where K is a bounded nonnegative function on R^d with compact support and (h_n) is a sequence of positive numbers satisfying h_n → 0 and n h_n^d → ∞. It is shown that E{∫ |m_n(x) − m(x)|^p μ(dx)} → 0 whenever E(|Y|^p) < ∞ (p ≥ 1). No other restrictions are placed on the distribution of (X, Y). The result is applied to verify the Bayes risk consistency of the corresponding discrimination rules.
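The estimator in the abstract is straightforward to write down; a small sketch with an Epanechnikov kernel (bounded, nonnegative, and compactly supported, as the theorem requires; the one-dimensional case for simplicity):

```python
def epanechnikov(u):
    # a bounded nonnegative kernel with compact support [-1, 1]
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def kernel_estimate(sample, x, h):
    # m_n(x) = sum_i Y_i K((X_i - x)/h) / sum_i K((X_i - x)/h)
    num = sum(yi * epanechnikov((xi - x) / h) for xi, yi in sample)
    den = sum(epanechnikov((xi - x) / h) for xi, _ in sample)
    return num / den if den > 0 else 0.0

# noise-free Y = X on a grid: the estimate at 0.5 is a symmetric
# weighted average of nearby Y values, so it recovers m(0.5) = 0.5
sample = [(i / 10, i / 10) for i in range(11)]
print(kernel_estimate(sample, 0.5, 0.25))
```

The theorem's conditions correspond to shrinking h with n (h_n → 0) while keeping enough points in each window (n h_n^d → ∞), which is what makes this local weighted average consistent without distributional assumptions.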