Results 1  10
of
47
Locally weighted learning
 ARTIFICIAL INTELLIGENCE REVIEW
, 1997
"... This paper surveys locally weighted learning, a form of lazy learning and memorybased learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, ass ..."
Abstract

Cited by 448 (52 self)
 Add to MetaCart
This paper surveys locally weighted learning, a form of lazy learning and memorybased learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning t parameters, interference between old and new data, implementing locally weighted learning e ciently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control.
Xmeans: Extending Kmeans with Efficient Estimation of the Number of Clusters
 In Proceedings of the 17th International Conf. on Machine Learning
, 2000
"... Despite its popularity for general clustering, Kmeans suffers three major shortcomings; it scales poorly computationally, the number of clusters K has to be supplied by the user, and the search is prone to local minima. We propose solutions for the first two problems, and a partial remedy for the t ..."
Abstract

Cited by 267 (5 self)
 Add to MetaCart
Despite its popularity for general clustering, Kmeans suffers three major shortcomings; it scales poorly computationally, the number of clusters K has to be supplied by the user, and the search is prone to local minima. We propose solutions for the first two problems, and a partial remedy for the third. Building on prior work for algorithmic acceleration that is not based on approximation, we introduce a new algorithm that efficiently, searches the space of cluster locations and number of clusters to optimize the Bayesian Information Criterion (BIC) or the Akaike Information Criterion (AIC) measure. The innovations include two new ways of exploiting cached sufficient statistics and a new very efficient test that in one Kmeans sweep selects the most promising subset of classes for refinement. This gives rise to a fast, statistically founded algorithm that outputs both the number of classes and their parameters. Experiments show this technique reveals the true number of classes in the underlying distribution, and that it is much faster than repeatedly using accelerated Kmeans for different values of K.
Improved heterogeneous distance functions
 Journal of Artificial Intelligence Research
, 1997
"... Instancebased learning techniques typically handle continuous and linear input values well, but often do not handle nominal input attributes appropriately. The Value Difference Metric (VDM) was designed to find reasonable distance values between nominal attribute values, but it largely ignores cont ..."
Abstract

Cited by 199 (10 self)
 Add to MetaCart
Instancebased learning techniques typically handle continuous and linear input values well, but often do not handle nominal input attributes appropriately. The Value Difference Metric (VDM) was designed to find reasonable distance values between nominal attribute values, but it largely ignores continuous attributes, requiring discretization to map continuous values into nominal values. This paper proposes three new heterogeneous distance functions, called the Heterogeneous Value Difference Metric (HVDM), the Interpolated Value Difference Metric (IVDM), and the Windowed Value Difference Metric (WVDM). These new distance functions are designed to handle applications with nominal attributes, continuous attributes, or both. In experiments on 48 applications the new distance metrics achieve higher classification accuracy on average than three previous distance functions on those datasets that have both nominal and continuous attributes.
Locally Weighted Learning for Control
, 1996
"... Lazy learning methods provide useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of complex systems. This paper surveys ways in which locally weighted learning, a type of lazy learning, has been applied by us to control tasks. We ex ..."
Abstract

Cited by 159 (17 self)
 Add to MetaCart
Lazy learning methods provide useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of complex systems. This paper surveys ways in which locally weighted learning, a type of lazy learning, has been applied by us to control tasks. We explain various forms that control tasks can take, and how this affects the choice of learning paradigm. The discussion section explores the interesting impact that explicitly remembering all previous experiences has on the problem of learning to control.
Using the Triangle Inequality to Accelerate kMeans
, 2003
"... The kmeans algorithm is by far the most widely used method for discovering clusters in data. We show how to accelerate it dramatically, while still always computing exactly the same result as the standard algorithm. The accelerated algorithm avoids unnecessary distance calculations by applying the ..."
Abstract

Cited by 98 (1 self)
 Add to MetaCart
The kmeans algorithm is by far the most widely used method for discovering clusters in data. We show how to accelerate it dramatically, while still always computing exactly the same result as the standard algorithm. The accelerated algorithm avoids unnecessary distance calculations by applying the triangle inequality in two different ways, and by keeping track of lower and upper bounds for distances between points and centers. Experiments show that the new algorithm is effective for datasets with up to 1000 dimensions, and becomes more and more effective as the number k of clusters increases. For k>=20 it is many times faster than the best previously known accelerated kmeans method.
IGTree: Using Trees for Compression and Classification in Lazy Learning Algorithms
, 1997
"... We describe the IGTree learning algorithm, which compresses an instance base into a tree structure. The concept of information gain is used as a heuristic function for performing this compression. IGTree produces trees that, compared to other lazy learning approaches, reduce storage requirements and ..."
Abstract

Cited by 95 (54 self)
 Add to MetaCart
We describe the IGTree learning algorithm, which compresses an instance base into a tree structure. The concept of information gain is used as a heuristic function for performing this compression. IGTree produces trees that, compared to other lazy learning approaches, reduce storage requirements and the time required to compute classifications. Furthermore, we obtained similar or better generalization accuracy with IGTree when trained on two complex linguistic tasks, viz. letterphoneme transliteration and partofspeechtagging, when compared to alternative lazy learning and decision tree approaches (viz., IB1, informationgainweighted IB1, and C4.5). A third experiment, with the task of word hyphenation, demonstrates that when the mutual differences in information gain of features is too small, IGTree as well as informationgainweighted IB1 perform worse than IB1. These results indicate that IGTree is a useful algorithm for problems characterized by the availability of a large num...
`NBody' Problems in Statistical Learning
, 2001
"... We present efficient algorithms for allpointpairs problems, or 'Nbody 'like problems, which are ubiquitous in statistical learning. We focus on six examples, including nearestneighbor classification, kernel density estimation, outlier detection, and the twopoint correlation. ..."
Abstract

Cited by 90 (12 self)
 Add to MetaCart
We present efficient algorithms for allpointpairs problems, or 'Nbody 'like problems, which are ubiquitous in statistical learning. We focus on six examples, including nearestneighbor classification, kernel density estimation, outlier detection, and the twopoint correlation.
Very Fast EMbased Mixture Model Clustering Using Multiresolution kdtrees
 In Advances in Neural Information Processing Systems 11
, 1998
"... Clustering is importantinmany fields including manufacturing, biology, finance, and astronomy. Mixture models are a popular approach due to their statistical foundations, and EM is a very popular method for finding mixture models. EM, however, requires many accesses of the data, and thus has bee ..."
Abstract

Cited by 89 (4 self)
 Add to MetaCart
Clustering is importantinmany fields including manufacturing, biology, finance, and astronomy. Mixture models are a popular approach due to their statistical foundations, and EM is a very popular method for finding mixture models. EM, however, requires many accesses of the data, and thus has been dismissed as impractical (e.g. (Zhang, Ramakrishnan, & Livny, 1996)) for data mining of enormous datasets.
Efficient Locally Weighted Polynomial Regression Predictions
 In Proceedings of the 1997 International Machine Learning Conference
"... Locally weighted polynomial regression (LWPR) is a popular instancebased algorithm for learning continuous nonlinear mappings. For more than two or three inputs and for more than a few thousand datapoints the computational expense of predictions is daunting. We discuss drawbacks with previous appr ..."
Abstract

Cited by 79 (11 self)
 Add to MetaCart
Locally weighted polynomial regression (LWPR) is a popular instancebased algorithm for learning continuous nonlinear mappings. For more than two or three inputs and for more than a few thousand datapoints the computational expense of predictions is daunting. We discuss drawbacks with previous approaches to dealing with this problem, and present a new algorithm based on a multiresolution search of a quicklyconstructible augmented kdtree. Without needing to rebuild the tree, we can make fast predictions with arbitrary local weighting functions, arbitrary kernel widths and arbitrary queries. The paper begins with a new, faster, algorithm for exact LWPR predictions. Next we introduce an approximation that achieves up to a twoordersof magnitude speedup with negligible accuracy losses. Increasing a certain approximation parameter achieves greater speedups still, but with a correspondingly larger accuracy degradation. This is nevertheless useful during operations such as the early stages...