Results 1-10 of 54
Locally Weighted Learning for Control
, 1996
Cited by 197 (19 self)
Lazy learning methods provide useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of complex systems. This paper surveys ways in which locally weighted learning, a type of lazy learning, has been applied by us to control tasks. We explain various forms that control tasks can take, and how this affects the choice of learning paradigm. The discussion section explores the interesting impact that explicitly remembering all previous experiences has on the problem of learning to control.
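As a rough illustration of the locally weighted learning idea named in this abstract, the sketch below keeps every stored experience and fits a small weighted linear model only when a query arrives. The function name `lwr_predict`, the Gaussian kernel, and the `bandwidth` parameter are assumptions chosen for illustration, not details taken from the paper.

```python
import math

def lwr_predict(xs, ys, query, bandwidth=0.5):
    """Locally weighted linear regression at one query point (1-D inputs).

    Hypothetical helper: kernel and bandwidth are illustrative assumptions.
    """
    # Gaussian kernel weights: stored points near the query count more.
    w = [math.exp(-((x - query) ** 2) / (2.0 * bandwidth ** 2)) for x in xs]
    total = sum(w)
    # Weighted means of inputs and outputs.
    mx = sum(wi * x for wi, x in zip(w, xs)) / total
    my = sum(wi * y for wi, y in zip(w, ys)) / total
    # Weighted least-squares slope around the weighted means.
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 1e-12 else 0.0
    return my + slope * (query - mx)

# Usage: the "training" step is just storing the data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
prediction = lwr_predict(xs, ys, 2.5)
```

All the work happens at query time, which is what makes the method "lazy": nothing is fitted until a prediction is requested.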
A Machine Learning Architecture for Optimizing Web Search Engines
 In AAAI Workshop on Internet-based Information Systems
, 1996
Cited by 77 (10 self)
Indexing systems for the World Wide Web, such as Lycos and Alta Vista, play an essential role in making the Web useful and usable. These systems are based on Information Retrieval methods for indexing plain text documents, but also include heuristics for adjusting their document rankings based on the special HTML structure of Web documents. In this paper, we describe a wide range of such heuristics, including a novel one inspired by reinforcement learning techniques for propagating rewards through a graph, which can be used to affect a search engine's rankings. We then demonstrate a system which learns to combine these heuristics automatically, based on feedback collected unintrusively from users, resulting in much improved rankings. 1 Introduction Lycos (Mauldin & Leavitt 1994), Alta Vista, and similar Web search engines have become essential as tools for locating information on the ever-growing World Wide Web. Underlying these systems are statistical methods for indexing plain te...
Learning Evaluation Functions to Improve Optimization by Local Search
 Journal of Machine Learning Research
, 2000
Cited by 59 (0 self)
This paper describes algorithms that learn to improve search performance on large-scale optimization tasks. The main algorithm, Stage, works by learning an evaluation function that predicts the outcome of a local search algorithm, such as hillclimbing or Walksat, from features of states visited during search. The learned evaluation function is then used to bias future search trajectories toward better optima on the same problem. Another algorithm, XStage, transfers previously learned evaluation functions to new, similar optimization problems. Empirical results are provided on seven large-scale optimization domains: bin-packing, channel routing, Bayesian network structure-finding, radiotherapy treatment planning, cartogram design, Boolean satisfiability, and Boggle board setup.
Active policy learning for robot planning and exploration under uncertainty
 In Proceedings of Robotics: Science and Systems
, 2007
Cited by 39 (5 self)
This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially-observed sequential decision processes. The algorithm is tested in the domain of robot navigation and exploration under uncertainty. In such a setting, the expected cost, which must be minimized, is a function of the belief state (filtering distribution). This filtering distribution is in turn nonlinear and subject to discontinuities, which arise because of constraints in the robot motion and control models. As a result, the expected cost is nondifferentiable and very expensive to simulate. The new algorithm overcomes the first difficulty and reduces the number of required simulations as follows. First, it assumes that we have carried out previous simulations which returned values of the expected cost for different corresponding policy parameters. Second, it fits a Gaussian process (GP) regression model to these values, so as to approximate the expected cost as a function of the policy parameters. Third, it uses the GP predicted mean and variance to construct a statistical measure that determines which policy parameters should be used in the next simulation. The process is then repeated using the new parameters and the newly gathered expected cost observation. Since the objective is to find the policy parameters that minimize the expected cost, this iterative active learning approach effectively trades off between exploration (in regions where the GP variance is large) and exploitation (where the GP mean is low). In our experiments, a robot uses the proposed algorithm to plan an optimal path for accomplishing a series of tasks, while maximizing the information about its pose and map estimates. These estimates are obtained with a standard filter for simultaneous localization and mapping. Upon gathering new observations, the robot updates the state estimates and is able to replan a new path in the spirit of open-loop feedback control.
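The third step above, turning a GP mean and variance into a rule for choosing the next simulation, can be sketched as follows. The abstract does not name the statistical measure, so this sketch assumes a lower confidence bound purely for illustration; `pick_next`, `kappa`, and the candidate tuples are hypothetical names, and the GP predictions are taken as given rather than computed.

```python
import math

def pick_next(candidates, kappa=2.0):
    """Choose the policy parameters to simulate next.

    candidates: list of (params, gp_mean, gp_variance) tuples, where the
    mean and variance are assumed to come from an already fitted GP.
    kappa: assumed exploration weight (larger favors uncertain regions).
    """
    def lower_confidence_bound(candidate):
        _, mean, variance = candidate
        # Low mean (exploitation) or high variance (exploration) both
        # lower the score when the goal is to minimize expected cost.
        return mean - kappa * math.sqrt(max(variance, 0.0))

    return min(candidates, key=lower_confidence_bound)[0]

# Usage: "b" has a worse predicted cost but much higher uncertainty.
candidates = [("a", 1.0, 0.0), ("b", 1.5, 1.0), ("c", 0.9, 0.0)]
explore_choice = pick_next(candidates, kappa=2.0)  # selects "b"
exploit_choice = pick_next(candidates, kappa=0.0)  # selects "c"
```

Setting `kappa` to zero reduces the rule to pure exploitation of the GP mean, which makes the exploration and exploitation trade-off described in the abstract easy to see.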
Q2: Memory-based active learning for optimizing noisy continuous functions
, 1998
Cited by 24 (3 self)
This paper introduces a new algorithm, Q2, for optimizing the expected output of a multi-input noisy continuous function. Q2 is designed to need only a few experiments, it avoids strong assumptions on the form of the function, and it is autonomous in that it requires little problem-specific tweaking. These capabilities are directly applicable to industrial processes, and may become increasingly valuable elsewhere as the machine learning field expands beyond prediction and function identification, and into embedded active learning subsystems in robots, vehicles and consumer products. Four existing approaches to this problem (response surface methods, numerical optimization, supervised learning, and evolutionary methods) all have inadequacies when the requirement of "black box" behavior is combined with the need for few experiments. Q2 uses instance-based determination of a convex region of interest for performing experiments. In conventional instance-based approaches to learning, a neigh...
A Nonparametric Approach to Noisy and Costly Optimization
, 2000
Cited by 23 (3 self)
This paper describes PAIRWISE BISECTION: a nonparametric approach to optimizing a noisy function with few function evaluations.
Batch Bayesian Optimization via Simulation Matching
Cited by 14 (6 self)
Bayesian optimization methods are often used to optimize unknown functions that are costly to evaluate. Typically, these methods sequentially select inputs to be evaluated one at a time based on a posterior over the unknown function that is updated after each evaluation. In many applications, however, it is desirable to perform multiple evaluations in parallel, which requires selecting batches of multiple inputs to evaluate at once. In this paper, we propose a novel approach to batch Bayesian optimization, providing a policy for selecting batches of inputs with the goal of optimizing the function as efficiently as possible. The key idea is to exploit the availability of high-quality and efficient sequential policies, by using Monte-Carlo simulation to select input batches that closely match their expected behavior. Our experimental results on six benchmarks show that the proposed approach significantly outperforms two baselines and can lead to large advantages over a top sequential approach in terms of performance per unit time.
Active Learning For Identifying Function Threshold Boundaries
Cited by 13 (5 self)
We present an efficient algorithm to actively select queries for learning the boundaries separating a function domain into regions where the function is above and below a given threshold. We develop experiment selection methods based on entropy, misclassification rates, variance, and their combinations, and show how they perform on a number of data sets. We then show how these algorithms are used to determine simultaneously valid 1 − α confidence intervals for seven cosmological parameters. Experimentation shows that the algorithm reduces the computation necessary for the parameter estimation problem by an order of magnitude.
Reactive search: machine learning for memory-based heuristics
 Teofilo F. Gonzalez (Ed.), Approximation Algorithms and Metaheuristics, Taylor & Francis Books (CRC Press)
, 2005
Cited by 13 (5 self)
1 Introduction: the role of the user in heuristics Most state-of-the-art heuristics are characterized by a certain number of choices and free parameters, whose appropriate setting is a subject that raises issues of research methodology [5, 41, 51]. In some cases, these parameters are tuned through a feedback loop that includes the user as a crucial learning component: depending on preliminary algorithm tests some parameter values are changed by the
Using Response Surfaces and Expected Improvement to Optimize Snake Robot Gait Parameters
Cited by 11 (2 self)
Several categories of optimization problems suffer from expensive objective function evaluation, driving the need for smart selection of subsequent experiments. One such category of problems involves physical robotic systems, which often require significant time, effort, and monetary expenditure in order to run tests. To assist in the selection of the next experiment, there has been a focus on the idea of response surfaces in recent years. These surfaces interpolate the existing data and provide a measure of confidence in their error, serving as a low-fidelity surrogate function that can be used to more intelligently choose the next experiment. In this paper, we robustly implement a previous algorithm based on the response surface methodology with an expected improvement criterion. We apply this technique to optimize open-loop gait parameters for snake robots, and demonstrate improved locomotive capabilities.
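For readers unfamiliar with the expected improvement criterion mentioned in this abstract, the sketch below gives its standard closed form under a Gaussian prediction. It assumes a minimization convention and a surrogate that supplies a predicted mean and standard deviation at each candidate; the function name and arguments are illustrative and are not taken from the paper, which may use a different convention (for example, maximizing locomotion performance).

```python
import math

def expected_improvement(mean, std, best_so_far):
    """Expected improvement over best_so_far when minimizing.

    mean, std: assumed Gaussian prediction from a response surface.
    best_so_far: lowest objective value observed in past experiments.
    """
    if std <= 0.0:
        # No uncertainty: improvement is deterministic.
        return max(best_so_far - mean, 0.0)
    z = (best_so_far - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (best_so_far - mean) * cdf + std * pdf

# Usage: choose the candidate experiment with the largest expected improvement.
candidates = {"gait_a": (0.8, 0.1), "gait_b": (1.0, 0.6)}  # hypothetical (mean, std)
best_seen = 0.9
next_experiment = max(
    candidates,
    key=lambda name: expected_improvement(*candidates[name], best_seen),
)
```

The first term rewards candidates whose predicted value already beats the best observation, and the second rewards candidates whose prediction is uncertain, so a single score balances both.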