## Efficient Locally Weighted Polynomial Regression Predictions

Venue: Proceedings of the 1997 International Machine Learning Conference

Citations: 81 (11 self)

### BibTeX

@INPROCEEDINGS{Moore_efficientlocally,
  author = {Andrew W. Moore and Jeff Schneider and Kan Deng},
  title = {Efficient Locally Weighted Polynomial Regression Predictions},
  booktitle = {Proceedings of the 1997 International Machine Learning Conference},
  year = {1997},
  pages = {236--244},
  publisher = {Morgan Kaufmann}
}

### Abstract

Locally weighted polynomial regression (LWPR) is a popular instance-based algorithm for learning continuous non-linear mappings. For more than two or three inputs and for more than a few thousand datapoints, the computational expense of predictions is daunting. We discuss drawbacks of previous approaches to dealing with this problem, and present a new algorithm based on a multiresolution search of a quickly constructible augmented kd-tree. Without needing to rebuild the tree, we can make fast predictions with arbitrary local weighting functions, arbitrary kernel widths and arbitrary queries. The paper begins with a new, faster algorithm for exact LWPR predictions. Next we introduce an approximation that achieves up to a two-orders-of-magnitude speedup with negligible accuracy losses. Increasing a certain approximation parameter achieves greater speedups still, but with a correspondingly larger accuracy degradation. This is nevertheless useful during operations such as the early stages...
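For context, the brute-force prediction the paper accelerates can be sketched as a weighted least-squares fit around each query: every datapoint gets a kernel weight and a local line is solved in closed form, which costs O(N) per query. This is an illustrative 1-D sketch, not the paper's algorithm; the function and parameter names (`lwr_predict`, `bandwidth`) are hypothetical.

```python
import math

def lwr_predict(xs, ys, xq, bandwidth=1.0):
    """Naive locally weighted linear prediction at query xq (1-D sketch).

    Gaussian kernel weights; solves the 2x2 weighted normal equations
    for y = a + b*x in closed form. Visits every datapoint -- this is
    the O(N)-per-query cost the paper's kd-tree search avoids.
    """
    w = [math.exp(-((x - xq) ** 2) / (2 * bandwidth ** 2)) for x in xs]
    S = sum(w)
    Sx = sum(wi * x for wi, x in zip(w, xs))
    Sy = sum(wi * y for wi, y in zip(w, ys))
    Sxx = sum(wi * x * x for wi, x in zip(w, xs))
    Sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    det = S * Sxx - Sx * Sx
    if abs(det) < 1e-12:            # degenerate: fall back to weighted mean
        return Sy / S
    a = (Sxx * Sy - Sx * Sxy) / det
    b = (S * Sxy - Sx * Sy) / det
    return a + b * xq

# On exactly linear data the local fit recovers the line at any query.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]
print(round(lwr_predict(xs, ys, 2.5), 6))  # 6.0
```

Replacing the closed-form line with a higher-degree local polynomial changes only the normal equations; the per-query scan over all datapoints, which the paper targets, stays the same.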

### Citations

3956 | Classification and regression trees
- Breiman
- 1984

Citation context: ...nd in turn they have children. This continues recursively until the leaf nodes, which each contain just one point. How do we decide which input attribute to split on and where? Unlike decision trees (Breiman et al., 1984; Quinlan, 1983) for induction, the sole purpose of the splits is to increase computational efficiency---not to alter the inductive bias. We do not believe that the choice between the numerous kd-tre...

1339 | Generalized Additive Models
- Hastie, Tibshirani
- 1990

Citation context: ...t dynamics (Moore, 1992; Schaal and Atkeson, 1994) and learning process models. Both classical and Bayesian linear regression analysis tools can be extended to work in the locally weighted framework (Hastie and Tibshirani, 1990), providing confidence intervals on predictions, on gradient estimates and on noise estimates---all important when a learned mapping is to be used by a controller (Atkeson et al., 1997b; Schneider, 1...

593 | An Algorithm for Finding Best Matches in Logarithmic Expected Time
- Friedman, Bentley, et al.
- 1977

Citation context: ...analysis, providing noise estimates and confidence intervals along with the prediction. A third solution, and one which does retain information, uses a technique called range-searching with kd-trees (Friedman et al., 1977; Preparata and Shamos, 1985). It is possible to arrange data in such a way that given a query point and a distance, all datapoints within the given distance of the query are returned without needing...
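The range-searching idea in the excerpt above can be sketched with a toy kd-tree: subtrees whose splitting plane lies farther than the query radius are pruned, so most of the dataset is never examined. This is a minimal illustrative sketch, not the paper's augmented tree (which additionally stores sufficient statistics in the nodes); all names here are hypothetical.

```python
import math

def build_kdtree(points, depth=0):
    # Minimal kd-tree: split on alternating coordinates at the median point.
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def range_search(node, query, radius, found):
    # Collect every point within `radius` of `query`, pruning any subtree
    # whose splitting plane is farther than `radius` from the query.
    if node is None:
        return
    p, axis = node["point"], node["axis"]
    if math.dist(p, query) <= radius:
        found.append(p)
    diff = query[axis] - p[axis]
    near, far = (node["left"], node["right"]) if diff <= 0 else (node["right"], node["left"])
    range_search(near, query, radius, found)
    if abs(diff) <= radius:          # the far side may still contain hits
        range_search(far, query, radius, found)

pts = [(0.0, 0.0), (1.0, 1.0), (3.0, 3.0), (0.5, 0.2)]
tree = build_kdtree(pts)
hits = []
range_search(tree, (0.0, 0.0), 1.0, hits)
print(sorted(hits))  # [(0.0, 0.0), (0.5, 0.2)]
```

As the surrounding contexts note, plain range search discards points outside the radius entirely; the paper's contribution is to account for the *weight* of pruned subtrees rather than ignore them.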

453 | Computational Geometry
- Preparata, Shamos
- 1985

Citation context: ...ise estimates and confidence intervals along with the prediction. A third solution, and one which does retain information, uses a technique called range-searching with kd-trees (Friedman et al., 1977; Preparata and Shamos, 1985). It is possible to arrange data in such a way that given a query point and a distance, all datapoints within the given distance of the query are returned without needing to search the entire dataset...

310 | Learning Efficient Classification Procedures and Their Application to Chess End Games
- Quinlan
- 1983

Citation context: ...hildren. This continues recursively until the leaf nodes, which each contain just one point. How do we decide which input attribute to split on and where? Unlike decision trees (Breiman et al., 1984; Quinlan, 1983) for induction, the sole purpose of the splits is to increase computational efficiency---not to alter the inductive bias. We do not believe that the choice between the numerous kd-tree splitting cri...

259 | Locally weighted regression: An approach to regression analysis by local fitting
- Cleveland, Devlin
- 1988

Citation context: ...output vectors. It is particularly appropriate for learning complex highly non-linear functions of up to about 30 inputs from noisy data. Popularized in the statistics literature in the past decades (Cleveland and Devlin, 1988; Grosse, 1989; Atkeson et al., 1997a) it is enjoying increasing use in applications such as learning robot dynamics (Moore, 1992; Schaal and Atkeson, 1994) and learning process models. Both classical...

115 | Combining instance-based and model-based learning
- Quinlan
- 1993

Citation context: ...th a variable resolution kd-tree structure and multilinear interpolation within tree leaves. Unfortunately, continuous interpolation above two dimensions is very expensive in computation and memory. (Quinlan, 1993) also uses a caching method but ignores continuity by storing separate discontinuous linear maps in the leaves. Another downside to caching solutions is that they only record the fitted surface. The...

92 | Robot Juggling: Implementation of Memory-Based Learning
- Schaal, Atkeson
- 1994

Citation context: ...statistics literature in the past decades (Cleveland and Devlin, 1988; Grosse, 1989; Atkeson et al., 1997a) it is enjoying increasing use in applications such as learning robot dynamics (Moore, 1992; Schaal and Atkeson, 1994) and learning process models. Both classical and Bayesian linear regression analysis tools can be extended to work in the locally weighted framework (Hastie and Tibshirani, 1990), providing confidenc...

61 | Multiresolution instance-based learning
- Deng, Moore
- 1995

Citation context: ...ning that a significant fraction of the data (sometimes all the data) has non-zero weight. In that case, avoiding the zero-weight datapoints is not much help. In this paper we use the main idea from (Deng and Moore, 1995) in which a multiresolution data structure increased the speed of kernel regression (also known as Locally Weighted Averaging). Here, we extend that method to arbitrary locally weighted polynomials, ...

46 | Bumptrees for efficient function, constraint, and classification learning
- Omohundro
- 1991

Citation context: ...e datapoints in a linked list. • Instead of always searching the left child first, it is advantageous to search the node closest to x_query first. This strengthens the WSoFar bound. • Ball trees (Omohundro, 1991) play a similar role to a kd-tree used for range searching, but it is possible that a hierarchy of balls, each containing the sufficient statistics of datapoints they contain, could be used beneficia...

29 | Fast, robust adaptive control by learning only forward models
- Moore
- 1992

Citation context: ...rized in the statistics literature in the past decades (Cleveland and Devlin, 1988; Grosse, 1989; Atkeson et al., 1997a) it is enjoying increasing use in applications such as learning robot dynamics (Moore, 1992; Schaal and Atkeson, 1994) and learning process models. Both classical and Bayesian linear regression analysis tools can be extended to work in the locally weighted framework (Hastie and Tibshirani, ...

18 | LOESS: Multivariate Smoothing by Moving Least Squares (in Approximation Theory VI)
- Grosse
- 1989

Citation context: ...ularly appropriate for learning complex highly non-linear functions of up to about 30 inputs from noisy data. Popularized in the statistics literature in the past decades (Cleveland and Devlin, 1988; Grosse, 1989; Atkeson et al., 1997a) it is enjoying increasing use in applications such as learning robot dynamics (Moore, 1992; Schaal and Atkeson, 1994) and learning process models. Both classical and Bayesian ...


1 | Locally Weighted Learning for Control (accepted for publication)
- Atkeson, Moore, et al.
- 1997

Citation context: ...iate for learning complex highly non-linear functions of up to about 30 inputs from noisy data. Popularized in the statistics literature in the past decades (Cleveland and Devlin, 1988; Grosse, 1989; Atkeson et al., 1997a) it is enjoying increasing use in applications such as learning robot dynamics (Moore, 1992; Schaal and Atkeson, 1994) and learning process models. Both classical and Bayesian linear regression anal...