Locally Weighted Learning for Control
, 1996
Cited by 137 (17 self)
Lazy learning methods provide useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of complex systems. This paper surveys ways in which we have applied locally weighted learning, a type of lazy learning, to control tasks. We explain various forms that control tasks can take, and how this affects the choice of learning paradigm. The discussion section explores the impact that explicitly remembering all previous experiences has on the problem of learning to control.
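As a rough illustration of what "locally weighted" and "lazy" mean here, the following sketch stores every experience and fits a kernel-weighted line only when a query arrives. It is a minimal one-dimensional example under assumptions of ours: the Gaussian kernel, the `bandwidth` parameter, and the name `lwr_predict` are illustrative choices, not the surveyed paper's formulation.

```python
import math

def lwr_predict(memory, query, bandwidth=0.2):
    """Predict y at `query` from stored (x, y) pairs via a kernel-weighted line fit."""
    # Lazy step: nothing is fit in advance; the weights depend on the query.
    weights = [math.exp(-0.5 * ((x - query) / bandwidth) ** 2) for x, _ in memory]
    total = sum(weights)
    x_bar = sum(w * x for w, (x, _) in zip(weights, memory)) / total
    y_bar = sum(w * y for w, (_, y) in zip(weights, memory)) / total
    # Weighted least-squares slope around the weighted means.
    sxx = sum(w * (x - x_bar) ** 2 for w, (x, _) in zip(weights, memory))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, (x, y) in zip(weights, memory))
    slope = sxy / sxx if sxx > 1e-12 else 0.0
    return y_bar + slope * (query - x_bar)

# Hypothetical experience memory: samples of sin(x) on [0, 3.1].
memory = [(i / 10, math.sin(i / 10)) for i in range(32)]
memory.append((3.2, math.sin(3.2)))  # "learning" is just storing a new experience
prediction = lwr_predict(memory, 1.0)
```

Because all experiences are kept, adding data never requires retraining; the cost is paid at query time instead.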
A Machine Learning Architecture for Optimizing Web Search Engines
- In AAAI Workshop on Internet-based Information Systems
, 1996
Cited by 63 (9 self)
Indexing systems for the World Wide Web, such as Lycos and Alta Vista, play an essential role in making the Web useful and usable. These systems are based on Information Retrieval methods for indexing plain text documents, but also include heuristics for adjusting their document rankings based on the special HTML structure of Web documents. In this paper, we describe a wide range of such heuristics (including a novel one inspired by reinforcement learning techniques for propagating rewards through a graph) which can be used to affect a search engine's rankings. We then demonstrate a system which learns to combine these heuristics automatically, based on feedback collected unintrusively from users, resulting in much improved rankings.
1 Introduction
Lycos (Mauldin & Leavitt 1994), Alta Vista, and similar Web search engines have become essential as tools for locating information on the ever-growing World Wide Web. Underlying these systems are statistical methods for indexing plain te...
Learning Evaluation Functions to Improve Optimization by Local Search
- Journal of Machine Learning Research
, 2000
Cited by 49 (0 self)
This paper describes algorithms that learn to improve search performance on large-scale optimization tasks. The main algorithm, Stage, works by learning an evaluation function that predicts the outcome of a local search algorithm, such as hillclimbing or Walksat, from features of states visited during search. The learned evaluation function is then used to bias future search trajectories toward better optima on the same problem. Another algorithm, X-Stage, transfers previously learned evaluation functions to new, similar optimization problems. Empirical results are provided on seven large-scale optimization domains: bin-packing, channel routing, Bayesian network structure-finding, radiotherapy treatment planning, cartogram design, Boolean satisfiability, and Boggle board setup.
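The loop described in this abstract (run local search, label visited states with the search outcome, fit a predictor, then search on the predictor to pick the next start) can be sketched roughly as follows. This is a toy reading under assumptions of ours, not the paper's implementation: the integer `objective`, the quadratic regression on a single feature, the random-restart fallback, and all function names are hypothetical.

```python
import random

def objective(x):
    """Hypothetical toy objective on integers 0..99 with a local minimum every 7 steps."""
    return (x - 70) ** 2 / 100 + 3 * (x % 7)

def hillclimb(x, score):
    """Greedy descent on `score` over neighbours x-1, x+1; returns the states visited."""
    path = [x]
    while True:
        best = min((n for n in (x - 1, x + 1) if 0 <= n <= 99), key=score)
        if score(best) >= score(x):
            return path
        x = best
        path.append(x)

def fit_quadratic(data):
    """Least-squares fit of outcome ~ w0 + w1*u + w2*u^2 with u = x/100."""
    rows = [([1.0, x / 100, (x / 100) ** 2], y) for x, y in data]
    # Normal equations with a small ridge term so the system is never singular.
    A = [[sum(f[i] * f[j] for f, _ in rows) + (1e-6 if i == j else 0.0)
          for j in range(3)] for i in range(3)]
    b = [sum(f[i] * y for f, y in rows) for i in range(3)]
    for c in range(3):  # Gaussian elimination with partial pivoting
        p = max(range(c, 3), key=lambda r: abs(A[r][c]))
        A[c], A[p], b[c], b[p] = A[p], A[c], b[p], b[c]
        for r in range(c + 1, 3):
            m = A[r][c] / A[c][c]
            A[r] = [a - m * ac for a, ac in zip(A[r], A[c])]
            b[r] -= m * b[c]
    w = [0.0] * 3
    for r in (2, 1, 0):  # back substitution
        w[r] = (b[r] - sum(A[r][k] * w[k] for k in range(r + 1, 3))) / A[r][r]
    return w

def stage_sketch(n_rounds=10, seed=0):
    rng = random.Random(seed)
    data = []                      # (state, outcome of the search that visited it)
    x = rng.randrange(100)
    best = x
    for _ in range(n_rounds):
        path = hillclimb(x, objective)
        outcome = objective(path[-1])
        data += [(s, outcome) for s in path]
        if outcome < objective(best):
            best = path[-1]
        w = fit_quadratic(data)
        def learned(s, w=w):       # predicted outcome of searching from state s
            u = s / 100
            return w[0] + w[1] * u + w[2] * u * u
        new_start = hillclimb(path[-1], learned)[-1]
        # If the learned function suggests no move, fall back to a random restart.
        x = new_start if new_start != path[-1] else rng.randrange(100)
    return best

best_state = stage_sketch()
```

The design point the sketch tries to show is that the second search runs on the learned predictor rather than on the objective itself, so it can cross barriers that trap plain hillclimbing.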
Q2: Memory-based active learning for optimizing noisy continuous functions
, 1998
Cited by 16 (3 self)
This paper introduces a new algorithm, Q2, for optimizing the expected output of a multi-input noisy continuous function. Q2 is designed to need only a few experiments, it avoids strong assumptions on the form of the function, and it is autonomous in that it requires little problem-specific tweaking. These capabilities are directly applicable to industrial processes, and may become increasingly valuable elsewhere as the machine learning field expands beyond prediction and function identification, and into embedded active learning subsystems in robots, vehicles and consumer products. Four existing approaches to this problem (response surface methods, numerical optimization, supervised learning, and evolutionary methods) all have inadequacies when the requirement of "black box" behavior is combined with the need for few experiments. Q2 uses instance-based determination of a convex region of interest for performing experiments. In conventional instance-based approaches to learning, a neigh...
Reactive search: machine learning for memory-based heuristics
- Teofilo F. Gonzalez (Ed.), Approximation Algorithms and Metaheuristics, Taylor & Francis Books (CRC Press)
, 2005
Cited by 13 (5 self)
1 Introduction: the role of the user in heuristics
Most state-of-the-art heuristics are characterized by a certain number of choices and free parameters, whose appropriate setting is a subject that raises issues of research methodology [5, 41, 51]. In some cases, these parameters are tuned through a feedback loop that includes the user as a crucial learning component: depending on preliminary algorithm tests some parameter values are changed by the
A Nonparametric Approach to Noisy and Costly Optimization
, 2000
Cited by 12 (3 self)
This paper describes PAIRWISE BISECTION: a nonparametric approach to optimizing a noisy function with few function evaluations.
Active policy learning for robot planning and exploration under uncertainty
- In Proceedings of Robotics: Science and Systems
, 2007
Cited by 12 (1 self)
This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially-observed sequential decision processes. The algorithm is tested in the domain of robot navigation and exploration under uncertainty. In such a setting, the expected cost, which must be minimized, is a function of the belief state (filtering distribution). This filtering distribution is in turn nonlinear and subject to discontinuities, which arise because of constraints in the robot motion and control models. As a result, the expected cost is non-differentiable and very expensive to simulate. The new algorithm overcomes the first difficulty and reduces the number of required simulations as follows. First, it assumes that we have carried out previous simulations which returned values of the expected cost for different corresponding policy parameters. Second, it fits a Gaussian process (GP) regression model to these values, so as to approximate the expected cost as a function of the policy parameters. Third, it uses the GP predicted mean and variance to construct a statistical measure that determines which policy parameters should be used in the next simulation. The process is then repeated using the new parameters and the newly gathered expected cost observation. Since the objective is to find the policy parameters that minimize the expected cost, this iterative active learning approach effectively trades off between exploration (in regions where the GP variance is large) and exploitation (where the GP mean is low). In our experiments, a robot uses the proposed algorithm to plan an optimal path for accomplishing a series of tasks, while maximizing the information about its pose and map estimates. These estimates are obtained with a standard filter for simultaneous localization and mapping. Upon gathering new observations, the robot updates the state estimates and is able to replan a new path in the spirit of open-loop feedback control.
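The three steps in this abstract (reuse past simulations, fit a GP, pick the next parameters from the GP mean and variance) can be sketched in a few lines. The sketch below makes several assumptions that are ours, not the paper's: a scalar policy parameter, a squared-exponential kernel with a fixed length scale, a zero-mean prior, a lower-confidence-bound rule standing in for the unspecified "statistical measure", and a cheap `simulated_cost` stand-in for the expensive simulator.

```python
import math

def rbf(a, b, length=0.5):
    """Squared-exponential kernel between two scalar policy parameters."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_predict(xs, ys, x_new, noise=1e-6):
    """Zero-mean GP posterior mean and variance at x_new given observations."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, solve(K, ys)))
    reduction = sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, max(rbf(x_new, x_new) - reduction, 0.0)

def next_parameter(xs, ys, candidates, kappa=2.0):
    """Assumed acquisition rule: minimise the lower confidence bound mean - kappa*std."""
    def lcb(x):
        mean, var = gp_predict(xs, ys, x)
        return mean - kappa * math.sqrt(var)   # low mean or high variance both help
    return min(candidates, key=lcb)

def simulated_cost(theta):
    """Hypothetical stand-in for one expensive expected-cost simulation."""
    return (theta - 0.3) ** 2

def active_policy_search(n_iters=8):
    candidates = [i / 50 for i in range(51)]
    xs = [0.0, 1.0]                             # step 1: previously simulated parameters
    ys = [simulated_cost(x) for x in xs]
    for _ in range(n_iters):
        untried = [c for c in candidates if c not in xs]
        theta = next_parameter(xs, ys, untried)  # steps 2 and 3: fit GP, pick next
        xs.append(theta)
        ys.append(simulated_cost(theta))         # new observation; repeat
    best = min(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]

best_theta, best_cost = active_policy_search()
```

Each iteration spends exactly one simulation, which is the point of the approach when simulations dominate the cost.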
Active Learning For Identifying Function Threshold Boundaries
Cited by 6 (4 self)
We present an efficient algorithm to actively select queries for learning the boundaries separating a function domain into regions where the function is above and below a given threshold. We develop experiment selection methods based on entropy, misclassification rates, variance, and their combinations, and show how they perform on a number of data sets. We then show how these algorithms are used to determine simultaneously valid 1 - α confidence intervals for seven cosmological parameters. Experimentation shows that the algorithm reduces the computation necessary for the parameter estimation problem by an order of magnitude.
A Memory-Based RASH Optimizer
- In AAAI-06 Workshop on Heuristic Search, Memory Based Heuristics and Their Applications
, 2006
Cited by 4 (2 self)
This paper presents a memory-based Reactive Affine Shaker (M-RASH) algorithm for global optimization. The Reactive Affine Shaker is an adaptive search algorithm based only on the function values. M-RASH is an extension of RASH in which good starting points to RASH are suggested online by using Bayesian Locally Weighted Regression (B-LWR). Both techniques use the memory about the previous history of the search to guide the future exploration but in very different ways. RASH compiles the previous experience into a local search area where sample points are drawn, while locally-weighted regression saves the entire previous history to be mined extensively when an additional sample point is generated. Because of the high computational cost related to the B-LWR model, it is applied only to evaluate the potential of an initial point for a local search run. The experimental results, focused on the case where the dominant computational cost is the evaluation of the target function f, show that M-RASH is indeed capable of leading to good results for a smaller number of function evaluations.
Learning Evaluation Functions
- CMU CS Thesis Proposal
, 1996
Cited by 2 (0 self)
Evaluation functions are an essential component of practical search algorithms for optimization, planning and control. Examples of such algorithms include hillclimbing, simulated annealing, best-first search, A*, and alpha-beta. In all of these, the evaluation functions are typically built manually by domain experts, and may require considerable tweaking to work well. I will investigate the thesis that statistical machine learning can be used to automatically generate high-quality evaluation functions for practical combinatorial problems. The data for such learning is gathered by running trajectories through the search space. The learned evaluation function may be applied either to guide further exploration of the same space, or to improve performance in new problem spaces which share similar features. Two general families of learning algorithms apply here: reinforcement learning and meta-optimization. The reinforcement learning approach, dating back to Samuel's checkers player [ 1959 ...

