Results 1–9 of 9
Learning Evaluation Functions to Improve Optimization by Local Search
Journal of Machine Learning Research, 2000
Cited by 56 (0 self)

Abstract
This paper describes algorithms that learn to improve search performance on large-scale optimization tasks. The main algorithm, Stage, works by learning an evaluation function that predicts the outcome of a local search algorithm, such as hill-climbing or Walksat, from features of states visited during search. The learned evaluation function is then used to bias future search trajectories toward better optima on the same problem. Another algorithm, X-Stage, transfers previously learned evaluation functions to new, similar optimization problems. Empirical results are provided on seven large-scale optimization domains: bin-packing, channel routing, Bayesian network structure-finding, radiotherapy treatment planning, cartogram design, Boolean satisfiability, and Boggle board setup.
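The idea described above can be sketched in toy form: alternate the base local search on the objective f with hill-climbing on a learned predictor of f's eventual outcome. The one-feature least-squares fit, the neighborhood function, and the toy objective below are illustrative assumptions, not the paper's implementation.

```python
def hillclimb(x, score, neighbors, steps=100):
    """Greedy descent on `score`; returns the trajectory of states visited."""
    traj = [x]
    for _ in range(steps):
        nxt = min(neighbors(x), key=score)
        if score(nxt) >= score(x):
            break
        x = nxt
        traj.append(x)
    return traj

def fit_linear(states, outcomes, feats):
    """One-feature least-squares fit: V(x) = a * feats(x) + b."""
    fs = [feats(x) for x in states]
    n = len(fs)
    mf, my = sum(fs) / n, sum(outcomes) / n
    var = sum((f - mf) ** 2 for f in fs) or 1.0
    a = sum((f - mf) * (y - my) for f, y in zip(fs, outcomes)) / var
    return lambda x: a * feats(x) + (my - a * mf)

def stage(x0, f, neighbors, feats, rounds=5):
    """Toy Stage loop: search on f, learn V to predict f's outcome, restart on V."""
    states, outcomes, x = [], [], x0
    for _ in range(rounds):
        traj = hillclimb(x, f, neighbors)          # run the base local search
        outcome = f(traj[-1])                      # quality this run achieved
        states += traj
        outcomes += [outcome] * len(traj)          # every state labels its run's outcome
        v = fit_linear(states, outcomes, feats)    # learn the outcome predictor
        x = hillclimb(x, v, neighbors)[-1]         # hill-climb on V to pick a restart
    return min(states, key=f)

# Toy usage: minimize (x - 3)^2 over the integers with neighbors x ± 1.
f = lambda x: (x - 3) ** 2
nbrs = lambda x: [x - 1, x + 1]
print(stage(10, f, nbrs, feats=float))  # → 3
```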
Hierarchical Learning with Procedural Abstraction Mechanisms
1997
Cited by 33 (2 self)

Abstract
Evolutionary computation (EC) consists of the design and analysis of probabilistic algorithms inspired by the principles of natural selection and variation. Genetic Programming (GP) is one subfield of EC that emphasizes desirable features such as the use of procedural representations, the capability to discover and exploit intrinsic characteristics of the application domain, and the flexibility to adapt the shape and complexity of learned models. Approaches that learn monolithic representations are considerably less likely to be effective for complex problems, and standard GP is no exception. The main goal of this dissertation is to extend GP capabilities with automatic mechanisms to cope with problems of increasing complexity. Humans succeed here by skillfully using hierarchical decomposition and abstraction mechanisms. The translation of such mechanisms into a general computer implementation is a tremendous challenge, which requires a firm understanding of the interplay between repr...
Using Prediction to Improve Combinatorial Optimization Search
In Proc. of 6th Int'l Workshop on Artificial Intelligence and Statistics, 1997
Cited by 20 (1 self)

Abstract
To appear in AISTATS-97. This paper describes a statistical approach to improving the performance of stochastic search algorithms for optimization. Given a search algorithm A, we learn to predict the outcome of A as a function of state features along a search trajectory. Predictions are made by a function approximator such as global or locally-weighted polynomial regression; training data is collected by Monte-Carlo simulation. Extrapolating from this data produces a new evaluation function which can bias future search trajectories toward better optima. Our implementation of this idea, STAGE, has produced very promising results on two large-scale domains.

1 Introduction

The problem of combinatorial optimization is simply stated: given a finite state space X and an objective function f : X → ℝ, find an optimal state x* = argmin_{x ∈ X} f(x). Typically, X is huge, and finding an optimal x* is intractable. However, there are many heuristic algorithms that attempt to exploit f's structur...
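The problem statement in that introduction, made concrete: over a finite state space X with objective f, find the minimizing state. Exhaustive enumeration is only feasible for toy-sized X; the heuristics the paper discusses exist precisely because real instances are huge. The names below are illustrative.

```python
def argmin_over(X, f):
    """Exhaustive argmin over a finite iterable X under objective f."""
    best = None
    for x in X:
        if best is None or f(x) < f(best):
            best = x
    return best

# Tiny example: X = {0, ..., 15}, f(x) = (x - 11)^2.
print(argmin_over(range(16), lambda x: (x - 11) ** 2))  # → 11
```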
Solving Combinatorial Optimization Tasks by Reinforcement Learning: A General Methodology Applied to Resource-Constrained Scheduling
Journal of Artificial Intelligence Research, 1998
Cited by 16 (0 self)

Abstract
This paper introduces a methodology for solving combinatorial optimization problems through the application of reinforcement learning methods. The approach can be applied in cases where several similar instances of a combinatorial optimization problem must be solved. The key idea is to analyze a set of "training" problem instances and learn a search control policy for solving new problem instances. The search control policy has the twin goals of finding high-quality solutions and finding them quickly. Results of applying this methodology to a NASA scheduling problem show that the learned search control policy is much more effective than the best known non-learning search procedure, a method based on simulated annealing.

1. Introduction

Combinatorial optimization problems such as the traveling salesperson problem (TSP) and the resource-constrained scheduling problem (RCSP) are hard to solve optimally. All interesting formulations of these problems are NP-hard (Garey & Johnson, 1979; G...
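The key idea above, train on several instances and reuse the learned search control policy on a new one, can be sketched with tabular Q-learning on a toy chain task whose "instances" differ only in start state. The task, reward shape, and hyperparameters are assumptions for illustration; the paper's scheduling domain is far richer.

```python
import random

ACTIONS = (-1, +1)

def step(state, action):
    """Toy chain task: move left/right on 0..9; reaching state 0 is the goal."""
    nxt = max(0, min(9, state + action))
    return nxt, (1.0 if nxt == 0 else -0.1)   # small cost per step, bonus at goal

def q_learn(starts, episodes=2000, alpha=0.5, gamma=0.9, eps=0.3, seed=0):
    """Learn one Q-table from several training instances (start states)."""
    rng, q = random.Random(seed), {}
    for _ in range(episodes):
        s = rng.choice(starts)                 # sample a training instance
        for _ in range(30):
            if rng.random() < eps:             # epsilon-greedy exploration
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda b: q.get((s, b), 0.0))
            nxt, r = step(s, a)
            target = r + gamma * max(q.get((nxt, b), 0.0) for b in ACTIONS)
            q[(s, a)] = (1 - alpha) * q.get((s, a), 0.0) + alpha * target
            s = nxt
            if s == 0:
                break
    return q

def run_policy(q, start, max_steps=20):
    """Apply the learned policy greedily to a new instance."""
    s = start
    for _ in range(max_steps):
        if s == 0:
            break
        a = max(ACTIONS, key=lambda b: q.get((s, b), 0.0))
        s, _ = step(s, a)
    return s

q = q_learn(starts=[3, 5, 8])          # "training" instances
print(run_policy(q, start=7))          # new, similar instance → 0
```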
Efficient Value Function Approximation Using Regression Trees
In Proceedings of the IJCAI Workshop on Statistical Machine Learning for Large-Scale Optimization, 1999
Cited by 16 (0 self)

Abstract
Value function approximation is a problem central to reinforcement learning. Many applications of reinforcement learning have relied on neural network function approximators, which are very slow to train and require substantial parameter tweaking to obtain good performance. Other reinforcement learning studies have applied nearest-neighbor and CMAC function approximators, but these cannot scale to problems with many features, especially if some features are irrelevant. We describe initial work on a new function approximation method that uses regression trees to represent value functions. A novel aspect of our method is its error criterion, which combines three terms: the supervised training error, a Bellman error term, and an advantage error term. By using this composite error criterion, we are able to combine many of the benefits of fitted value iteration, TD(0), and advantage updating. The new method is compared experimentally to previous work that employed TD(λ) to solve job-shop sch...
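One plausible reading of the composite criterion above is a weighted sum of three squared-error terms evaluated for a candidate value function V: a supervised regression term against Monte-Carlo targets, a Bellman residual, and a hinge-style advantage term penalizing a chosen successor that looks worse than the best alternative. The exact terms, weights, and sample format are assumptions; the paper fits regression trees to minimize its criterion, while this sketch only evaluates one.

```python
def composite_error(V, samples, gamma=0.9, w_sup=1.0, w_bell=1.0, w_adv=1.0):
    """Mean composite error of value function V.

    samples: list of (state, target, reward, next_state, best_next_state),
    where `target` is a supervised label (e.g. an observed return) and
    `best_next_state` is the best alternative successor (assumed format).
    """
    sup = bell = adv = 0.0
    for s, y, r, s_next, s_best in samples:
        sup += (V(s) - y) ** 2                            # supervised training error
        bell += (V(s) - (r + gamma * V(s_next))) ** 2     # Bellman residual
        adv += max(0.0, V(s_best) - V(s_next)) ** 2       # advantage-style penalty
    n = len(samples)
    return (w_sup * sup + w_bell * bell + w_adv * adv) / n

# A self-consistent sample under V(s) = s and gamma = 0.9 scores zero error.
print(composite_error(lambda s: s, [(2, 2, 0.2, 2, 2)]))  # → 0.0
```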
Learning Evaluation Functions
CMU CS Thesis Proposal, 1996
Cited by 2 (0 self)

Abstract
Evaluation functions are an essential component of practical search algorithms for optimization, planning and control. Examples of such algorithms include hill-climbing, simulated annealing, best-first search, A*, and alpha-beta. In all of these, the evaluation functions are typically built manually by domain experts, and may require considerable tweaking to work well. I will investigate the thesis that statistical machine learning can be used to automatically generate high-quality evaluation functions for practical combinatorial problems. The data for such learning is gathered by running trajectories through the search space. The learned evaluation function may be applied either to guide further exploration of the same space, or to improve performance in new problem spaces which share similar features. Two general families of learning algorithms apply here: reinforcement learning and meta-optimization. The reinforcement learning approach, dating back to Samuel's checkers player [1959 ...
Coordination Of Supply Webs Based On Dispositive Protocols
In 10th European Conference on Information Systems (ECIS), 2002
Cited by 2 (0 self)

Abstract
The focus of this paper is the design of a mechanism to help economic agents, planning either autonomously or cooperatively, to achieve Pareto-optimal allocation of resources via a completely decentralized coordination of a logistics network. Besides giving a classification and a short review of existing scheduling approaches suitable for supply chain management, this article specifies and evaluates protocols employing time-dependent price functions. By performing simulations with the implemented protocol, using well-known Job Shop Scheduling Problems as a benchmark, we show the efficiency and feasibility of the designed mechanism. The approach enables each agent to exploit the external effects caused by resource constraints of its supply chain contractors by adapting its production planning. Additionally, the system's capability to reconfigure itself in case of production resource failures is increased. The evaluation of the protocols concludes with a welfare analysis investigating the payoff distribution along the supply chain. Finally, we conclude that future research on this topic should turn to learning agent systems to reduce communication costs.
A Study on Architecture, Algorithms, and Applications of Approximate Dynamic Programming Based Approach to Optimal Control
2004
of The Australian National University.
2002
Abstract
Professor Alexander Zelinsky. The work submitted in this thesis is a result of original research carried out by myself, in collaboration with others, while enrolled as a PhD student in the Department of Systems Engineering at the Australian National University. It has not been submitted for any other degree or award.