## Using Prediction to Improve Combinatorial Optimization Search (1997)

Venue: In Proc. of 6th Int'l Workshop on Artificial Intelligence and Statistics

Citations: 20 (1 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Boyan97usingprediction,
  author    = {Justin A. Boyan and Andrew W. Moore},
  title     = {Using Prediction to Improve Combinatorial Optimization Search},
  booktitle = {Proc. of 6th Int'l Workshop on Artificial Intelligence and Statistics},
  year      = {1997}
}
```

### Abstract

To appear in AISTATS-97.

This paper describes a statistical approach to improving the performance of stochastic search algorithms for optimization. Given a search algorithm A, we learn to predict the outcome of A as a function of state features along a search trajectory. Predictions are made by a function approximator such as global or locally-weighted polynomial regression; training data is collected by Monte-Carlo simulation. Extrapolating from this data produces a new evaluation function which can bias future search trajectories toward better optima. Our implementation of this idea, STAGE, has produced very promising results on two large-scale domains.

1 Introduction

The problem of combinatorial optimization is simply stated: given a finite state space X and an objective function f : X → ℝ, find an optimal state x* = argmin_{x∈X} f(x). Typically, X is huge, and finding an optimal x* is intractable. However, there are many heuristic algorithms that attempt to exploit f's structur...
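The loop the abstract describes — run the search, record (state features, eventual outcome) pairs by Monte-Carlo simulation, fit a regressor, and treat its predictions as a new evaluation function — can be sketched as follows. The bit-string objective, the feature set, and all names here are illustrative assumptions for a minimal sketch, not the paper's actual domains or features:

```python
import random
import numpy as np

random.seed(0)

# Toy objective on length-20 bit-strings (an assumption, not from the paper):
# count the zero bits plus adjacent disagreeing pairs; the global optimum is
# the all-ones string with f = 0, but greedy descent has many local minima.
N = 20

def f(x):
    zeros = sum(1 for b in x if b == 0)
    flips = sum(1 for i in range(N - 1) if x[i] != x[i + 1])
    return zeros + flips

def neighbors(x):
    # Single-bit-flip neighborhood.
    for i in range(N):
        y = list(x)
        y[i] = 1 - y[i]
        yield tuple(y)

def hillclimb(x, obj):
    """Greedy descent on obj; returns the whole trajectory of visited states."""
    traj = [x]
    while True:
        best = min(neighbors(x), key=obj)
        if obj(best) >= obj(x):
            return traj
        x = best
        traj.append(x)

def features(x):
    # Hypothetical state features: bias, fraction of ones, adjacent flips.
    return (1.0, sum(x) / N, sum(1 for i in range(N - 1) if x[i] != x[i + 1]))

# Monte-Carlo training data: label every state on a trajectory with the
# objective value of the local optimum that trajectory eventually reaches.
rows, labels = [], []
for _ in range(50):
    start = tuple(random.randint(0, 1) for _ in range(N))
    traj = hillclimb(start, f)
    outcome = f(traj[-1])
    for s in traj:
        rows.append(features(s))
        labels.append(outcome)

# Fit the predictor V_A by global linear least-squares regression.
w, *_ = np.linalg.lstsq(np.array(rows), np.array(labels), rcond=None)

def V(x):
    """Learned evaluation function: predicted outcome of search started at x."""
    return float(np.dot(w, features(x)))
```

In the full STAGE algorithm, hillclimbing on the learned V would then be used to pick promising restart points for the original search on f.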

### Citations

3007 | Dynamic Programming
- Bellman
- 1957

Citation context: ...y for exploring a Markov Decision Process, and VA is the value function of that policy: it predicts the eventual expected outcome from every state. Since value functions satisfy the Bellman equations [1], algorithms more sophisticated than Monte-Carlo simulation with supervised learning are applicable: in particular, the TD(λ) family of temporal-difference algorithms may make better use of training da...
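The context above notes that, because the value function satisfies the Bellman equations, temporal-difference methods could replace plain Monte-Carlo regression. A minimal TD(0) update for a linear value function is sketched below; the paper itself uses only supervised learning, so the per-step cost and feature vectors here are illustrative assumptions, not the authors' implementation:

```python
def td0_update(w, phi_s, phi_next, cost, alpha=0.1, gamma=1.0, terminal=False):
    """One TD(0) step for a linear value function V(s) = w . phi(s),
    applied to a single transition s -> s' with assumed step cost `cost`.
    At a terminal state the bootstrapped successor value is zero."""
    v_s = sum(wi * fi for wi, fi in zip(w, phi_s))
    v_next = 0.0 if terminal else sum(wi * fi for wi, fi in zip(w, phi_next))
    delta = cost + gamma * v_next - v_s      # temporal-difference error
    return [wi + alpha * delta * fi for wi, fi in zip(w, phi_s)]
```

Unlike the Monte-Carlo labels used in the paper, this update bootstraps from the successor state's current estimate, so each weight update can be made online, before the trajectory's final outcome is known.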

1350 | Learning to predict by the methods of temporal differences
- Sutton
- 1988

Citation context: ...sticated than Monte-Carlo simulation with supervised learning are applicable: in particular, the TD(λ) family of temporal-difference algorithms may make better use of training data and converge faster [6]. However, our experiments reported in this paper use only supervised learning. 2.2 Using the Predictions: The learned function VA is designed to predict which states make good starting places for se...

419 | Simulated Annealing: Theory and Applications
- Laarhoven, Aarts
- 1987

Citation context: ...for which f(y) < f(x), regardless of the past history of the search; local minima are terminal states of the chain. Simulated annealing is Markovian in the expanded state space X × {temperature} [7]. Thus, from a search trajectory of length 100, we would obtain not one but 100 training samples for VA. The state space X is huge, so we cannot expect our simulations to explore any significant fr...
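The point that simulated annealing is Markovian only in the expanded space X × {temperature} is visible in a single Metropolis step: the acceptance rule depends on the current temperature as well as the current state. A generic sketch, not tied to the paper's domains:

```python
import math
import random

def metropolis_step(x, T, f, neighbor, rng=random):
    """One simulated-annealing transition. The Markov state is the pair
    (x, T): the same x transitions differently at different temperatures."""
    y = neighbor(x)
    dE = f(y) - f(x)
    if dE <= 0 or rng.random() < math.exp(-dE / T):
        return y          # accept the move (always, if it improves f)
    return x              # reject the move; stay at x
```

At high T almost any move is accepted; as T → 0 the chain degenerates to the hillclimbing case described above, where only improving moves are taken.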

123 | A new adaptive multi-start technique for combinatorial global optimizations
- Boese, Kahng, et al.
- 1994

Citation context: ...wn is that for two large-scale problems, with very simple choices of features, operators, and models, a useful structure can be identified and exploited. A very relevant investigation by Boese et al. [2] gives further reasons for optimism. They studied the set of local minima reached by independent runs of hillclimbing on a traveling salesman problem and a graph bisection problem. They found a "big v...

60 | Simulated Annealing for VLSI Design
- Wong, Leong, et al.
- 1988

Citation context: ...uraging preliminary results in the domains of channel routing and map layout. 3.1 Channel Routing: The problem of "Manhattan channel routing" is an important subtask of VLSI circuit design [8]. Given two rows of labelled pins across a rectangular channel, we must connect like-labelled pins to one another by placing wire segments into vertical and horizontal tracks (see Figure 2). Segments ...

18 | Reinforcement Learning for Job-Shop Scheduling
- Zhang
- 1996

Citation context: ...e learning to do the same. Zhang and Dietterich have explored another way to use learning to improve combinatorial optimization: they learn a search strategy from scratch using online value iteration [9]. By contrast, STAGE begins with an already-given search strategy and uses prediction to learn to improve on it. Zhang and Dietterich reported success in transferring learned search control knowledge f...

8 | Synthesis of High-Performance Analog Cells in ASTRX/OBLX
- Ochotta
- 1994

6 | An efficient lower bound algorithm for channel routing. Integration: The VLSI Journal
- Chao, Harper
- 1996

Citation context: ...nd the extent to which it occurs is largely an empirical question. To investigate the potential for transfer, we re-ran experiment (C) on a suite of eight problems from the channel routing literature [3]. Table 2 summarizes the results and gives the coefficients of the linear evaluation function learned (independently) for each problem. To make the similarities easier to see in the table, we have nor...