Results 1 - 10 of 24
Efficient selectivity and backup operators in Monte-Carlo tree search
- In: Proceedings Computers and Games 2006, 2006
Cited by 66 (2 self)
Abstract. Monte-Carlo evaluation estimates a position by averaging the outcomes of several random continuations, and can serve as an evaluation function at the leaves of a min-max tree. This paper presents a new framework to combine tree search with Monte-Carlo evaluation that does not separate a min-max phase from a Monte-Carlo phase. Instead of backing up the min-max value close to the root, and the average value at some depth, a more general backup operator is defined that progressively changes from averaging to min-max as the number of simulations grows. This approach provides fine-grained control of the tree growth, at the level of individual simulations, and allows efficient selectivity methods. This algorithm was implemented in a 9 × 9 Go-playing program, Crazy Stone, that won the 10th KGS computer-Go tournament.
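To make the idea of a backup operator that moves from averaging to min-max more concrete, here is a minimal sketch for a max node. It is not the operator defined in the paper: the function name `blended_backup`, the weight `n / (n + k)`, and the constant `k` are all assumptions chosen for illustration only.

```python
def blended_backup(child_means, child_counts, k=50.0):
    """Back up a value for a max node from its children.

    Hypothetical schedule (an assumption, not taken from the paper):
    with few simulations the result stays close to the visit-weighted
    average of the children; with many simulations it approaches the
    mean of the best child.
    """
    total = sum(child_counts)
    if total == 0:
        raise ValueError("need at least one simulation")
    # Visit-weighted average over all children.
    average = sum(m * n for m, n in zip(child_means, child_counts)) / total
    # Best mean among children that were actually simulated.
    best = max(m for m, n in zip(child_means, child_counts) if n > 0)
    # Weight on the max term grows from 0 towards 1 with the simulation count.
    weight = total / (total + k)
    return (1.0 - weight) * average + weight * best


if __name__ == "__main__":
    print(blended_backup([0.2, 0.8], [1, 1]))        # close to the average
    print(blended_backup([0.2, 0.8], [1000, 1000]))  # close to the best child
```

For a min node the same sketch can be applied to negated child values.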
The Knowledge-Gradient Policy for Correlated Normal Beliefs
Cited by 11 (10 self)
We consider a Bayesian ranking and selection problem with independent normal rewards and a correlated multivariate normal belief on the mean values of these rewards. Because this formulation of the ranking and selection problem models dependence between alternatives’ mean values, algorithms may utilize this dependence to perform efficiently even when the number of alternatives is very large. We propose a fully sequential sampling policy called the knowledge-gradient policy, which is provably optimal in some special cases and has bounded suboptimality in all others. We then demonstrate how this policy may be applied to efficiently maximize a continuous function on a continuous domain while constrained to a fixed number of noisy measurements.
A knowledge-gradient policy for sequential information collection
- SIAM J. on Control and Optimization
Cited by 7 (7 self)
Abstract. In a sequential Bayesian ranking and selection problem with independent normal populations and common known variance, we study a previously introduced measurement policy which we refer to as the knowledge-gradient policy. This policy myopically maximizes the expected increment in the value of information in each time period, where the value is measured according to the terminal utility function. We show that the knowledge-gradient policy is optimal both when the horizon is a single time period and in the limit as the horizon extends to infinity. We show furthermore that, in some special cases, the knowledge-gradient policy is optimal regardless of the length of any given fixed total sampling horizon. We bound the knowledge-gradient policy’s suboptimality in the remaining cases, and show through simulations that it performs competitively with or significantly better than other policies.
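As a rough illustration of what "myopically maximizing the expected increment in value" means in the independent normal setting, the sketch below computes a knowledge-gradient factor for each alternative using the closed form commonly given for this case: the factor is sigma_tilde * (z * Phi(z) + phi(z)), where sigma_tilde is the reduction in posterior standard deviation from one more sample and z is the negated, normalized gap to the best competing mean. The function names and the list-based interface are assumptions made for this example.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def knowledge_gradient(means, variances, noise_variance):
    """Return one knowledge-gradient factor per alternative.

    means, variances: current posterior mean and variance of each alternative.
    noise_variance:   common, known sampling variance.
    """
    if len(means) < 2:
        raise ValueError("need at least two alternatives")
    factors = []
    for i, (mu, var) in enumerate(zip(means, variances)):
        # Standard deviation of the change in the posterior mean after one sample.
        sigma_tilde = var / math.sqrt(var + noise_variance) if var > 0 else 0.0
        if sigma_tilde == 0.0:
            factors.append(0.0)  # nothing left to learn about this alternative
            continue
        best_other = max(m for j, m in enumerate(means) if j != i)
        z = -abs(mu - best_other) / sigma_tilde
        factors.append(sigma_tilde * (z * normal_cdf(z) + normal_pdf(z)))
    return factors


if __name__ == "__main__":
    kg = knowledge_gradient([0.0, 0.1, -0.5], [1.0, 0.25, 4.0], 1.0)
    # The policy measures the alternative with the largest factor.
    print(max(range(len(kg)), key=kg.__getitem__), kg)
```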
Discrete optimization via simulation using COMPASS
- Operations Research, 2006
Cited by 6 (1 self)
doi 10.1287/opre.1050.0237. © 2006 INFORMS. We propose an optimization-via-simulation algorithm, called COMPASS, for use when the performance measure is estimated via a stochastic, discrete-event simulation, and the decision variables are integer ordered. We prove that COMPASS converges to the set of local optimal solutions with probability 1 for both terminating and steady-state simulation, and for both fully constrained problems and partially constrained or unconstrained problems under mild conditions.
Efficient Inference for Mixed Bayesian Networks
- Proceedings of the 5th ISIF/IEEE International Conference on Information Fusion, 2002
Cited by 5 (0 self)
Bayesian networks are a compact representation for probabilistic models and inference. They have been used successfully for multisensor fusion and situation assessment. It is well known that, in general, the inference algorithms to compute the exact posterior probability of the target state are either computationally infeasible for dense networks or impossible for mixed discrete-continuous networks. In those cases, one approach is to compute approximate results using simulation methods. This paper proposes efficient inference methods for those cases. The goal is not to compute the exact or approximate posterior probability of the target state, but to identify the top (most likely) states in an efficient manner. The approach is to use intelligent simulation techniques in which previous samples guide the future sampling strategy. By focusing the sampling on the "important" space, we are able to sort out the top candidates quickly. Simulation results are included to demonstrate the performance of the algorithms.
Automated Configuration of Algorithms for Solving Hard Computational Problems
- 2009
Cited by 5 (5 self)
The best-performing algorithms for many hard problems are highly parameterized. Selecting the best heuristics and tuning their parameters for optimal overall performance is often a difficult, tedious, and unsatisfying task. This thesis studies the automation of this important part of algorithm design: the configuration of discrete algorithm components and their continuous parameters to construct an algorithm with desirable empirical performance characteristics. Automated configuration procedures can facilitate algorithm development and be applied on the end user side to optimize performance for new instance types and optimization objectives. The use of such procedures separates high-level cognitive tasks carried out by humans from tedious low-level tasks that can be left to machines. We introduce two alternative algorithm configuration frameworks: iterated local search in parameter configuration space and sequential optimization based on response surface models. To the best of our knowledge, our local search approach is the first that goes beyond local optima. Our model-based search techniques significantly outperform existing techniques and extend them in ways crucial for general algorithm configuration: they can handle categorical parameters, optimization objectives defined across multiple instances, and tens of thousands
Optimal Learning
- 2008
Cited by 2 (2 self)
Optimal learning addresses the problem of efficiently collecting information with which to make decisions. These problems arise in both offline settings (making a series of measurements, after which a decision is made) and online settings (the process of making a decision results in observations that change the distribution of belief about future observations). Optimal learning is an issue primarily in applications where observations or measurements are expensive. These include expensive simulations (where a single observation might take a day or more), laboratory sciences (testing a drug compound in a lab), and field experiments (testing a new energy saving technology in a building). This tutorial provides an introduction to this problem area, covering important dimensions of a learning problem and introducing a range of policies for collecting information.
Efficient Simulation-Based Composition of Scheduling Policies by Integrating Ordinal Optimization With Design of Experiment
Abstract—Semiconductor wafer fab operations are characterized by complex and reentrant production processes over many heterogeneous machine groups with stringent performance requirements. Efficient composition of good scheduling policies from combinatorial options of wafer release and machine dispatching rules has posed a significant challenge to competitive fab operations. In this paper, we design a fast simulation-based methodology by an innovative integration of ordinal optimization (OO) and design of experiments (DOEs) to efficiently select a good scheduling policy for fab operations. Instead of finding the exact performance among scheduling policies, our approach compares their relative orders of performance to a specified level of confidence. Our new approach consists of three stages: performance estimation model construction using DOE, policy option screening process, and final simulation evaluation with intelligent computing budget allocation. The exponential convergence of OO is integrated into all the three stages to significantly improve computational efficiency. Simulation results of applications to scheduling wafer fabrications not only screen out good scheduling policies but also provide insights about how factors such as wafer release and the dispatching of each machine group may affect production cycle times and smoothness under a reentrant process flow. Most of the OO-based DOE simulations require 2–3 orders of magnitude less computation time than those of a traditional approach. Such a high speedup enables decision makers to explore much larger problems. Note to Practitioners—This paper designs a fast simulation-based methodology to compose a good scheduling policy from various dispatching rules of fab operations. 
The methodology innovatively applies DOE to estimate the performance of dispatching rule combinations (policies) over various machine groups in a fab, screens out good-enough policy options by using OO over the performance estimation, and allocates computation time intelli-
A LARGE DEVIATIONS PERSPECTIVE ON ORDINAL OPTIMIZATION
We consider the problem of optimal allocation of computing budget to maximize the probability of correct selection in the ordinal optimization setting. This problem has been studied in the literature in an approximate mathematical framework under the assumption that the underlying random variables have a Gaussian distribution. We use large deviations theory to develop a mathematically rigorous framework for determining the optimal allocation of computing resources even when the underlying variables have general, non-Gaussian distributions. Further, in a simple setting we show that when there exists an indifference zone, quick stopping rules may be developed that exploit the exponential decay rates of the probability of false selection. In practice, the distributions of the underlying variables are estimated from generated samples, leading to performance degradation due to estimation errors. On a positive note, we show that the corresponding estimates of optimal allocations converge to their true values as the number of samples used for estimation increases to infinity.
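The exponential decay of the probability of false selection is easiest to see in the simplest Gaussian case, which is used here only as an illustration and is narrower than the general setting of the paper. Assuming two systems with known variances, where a fraction p of the budget goes to the better system, the difference of the two sample means is normal, so the probability of picking the wrong system decays roughly like exp(-n * rate) with rate = (mu_best - mu_other)^2 / (2 * (var_best / p + var_other / (1 - p))). The sketch below evaluates that rate and searches a grid for the fraction that maximizes it; the function names and the grid search are assumptions made for this example.

```python
def gaussian_rate(mu_best, mu_other, var_best, var_other, p_best):
    """Decay rate of the false-selection probability for two normal systems.

    p_best is the fraction of the sampling budget given to the better system;
    the remainder goes to the other system.
    """
    if not 0.0 < p_best < 1.0:
        raise ValueError("p_best must lie strictly between 0 and 1")
    p_other = 1.0 - p_best
    gap = mu_best - mu_other
    return gap * gap / (2.0 * (var_best / p_best + var_other / p_other))


def best_fraction_by_grid(mu_best, mu_other, var_best, var_other, steps=999):
    """Grid search for the budget fraction that maximizes the decay rate."""
    grid = [(k + 1) / (steps + 1) for k in range(steps)]
    return max(
        grid,
        key=lambda p: gaussian_rate(mu_best, mu_other, var_best, var_other, p),
    )


if __name__ == "__main__":
    # Standard deviations 1 and 3: the noisier system should get more samples.
    print(best_fraction_by_grid(1.0, 0.0, 1.0, 9.0))
```

In this two-system case the maximizing fractions are proportional to the standard deviations, which the grid search reproduces.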
Proceedings of the 2002 Winter Simulation Conference
A simulation model is successful if it leads to policy action, i.e., if it is implemented. Studies show that for a model to be implemented, it must have good correspondence with the mental model of the system held by the user of the model. The user must feel confident that the simulation model corresponds to this mental model. An understanding of how the model works is required. Simulation models for implementation must be developed step by step, starting with a simple model, the simulation prototype. After this has been explained to the user, a more detailed model can be developed on the basis of feedback from the user. Software for simulation prototyping is discussed, e.g., with regard to the ease with which models and output can be explained and the speed with which small models can be written.

