Reinforcement learning: a survey
Journal of Artificial Intelligence Research, 1996
Cited by 1586 (25 self)
This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
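The exploration/exploitation trade-off named in this abstract can be illustrated with an epsilon-greedy rule on a multi-armed bandit. The following is only a minimal sketch, not code from the survey: the function name `epsilon_greedy_bandit`, the Gaussian reward noise with unit variance, and the default parameter values are all assumptions made for illustration.

```python
import random


def epsilon_greedy_bandit(arm_means, steps=2000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent on a bandit with Gaussian rewards.

    arm_means: true mean reward of each arm (hidden from the agent).
    Returns the agent's reward estimates and the pull count per arm.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # Assumed reward model: true mean plus unit-variance Gaussian noise.
        reward = rng.gauss(arm_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the sample mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


if __name__ == "__main__":
    est, cnt = epsilon_greedy_bandit([0.0, 0.5, 2.0])
    print("estimates:", est)
    print("pull counts:", cnt)
```

A larger `epsilon` spends more pulls on exploration, which improves the estimates for rarely chosen arms at the cost of reward collected along the way.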
Active Learning with Statistical Models
1995
Cited by 648 (12 self)
For many types of learners one can compute the statistically "optimal" way to select data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992; Cohn, 1994]. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate.
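Locally weighted regression, one of the two learners named in this abstract, fits a separate weighted linear model around each query point. The sketch below shows only that prediction step in one dimension and does not implement the paper's data-selection criterion. The function name `lwr_predict`, the Gaussian kernel, and the `bandwidth` parameter are assumptions chosen for illustration.

```python
import math


def lwr_predict(xs, ys, x_query, bandwidth=0.5):
    """Predict y at x_query with a locally weighted linear fit (1-D).

    Each training point is weighted by a Gaussian kernel centred on the
    query, then a weighted least-squares line is fitted and evaluated.
    """
    weights = [
        math.exp(-((x - x_query) ** 2) / (2.0 * bandwidth ** 2)) for x in xs
    ]
    total = sum(weights)
    # Weighted means of inputs and outputs.
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    # Weighted variance of x and weighted covariance of x and y.
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(
        w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys)
    )
    # Fall back to the weighted mean if the local fit is degenerate.
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return mean_y + slope * (x_query - mean_x)


if __name__ == "__main__":
    xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
    ys = [math.sin(x) for x in xs]
    print(lwr_predict(xs, ys, 1.2))
```

Because every prediction is a small closed-form fit, quantities such as the local slope are cheap to obtain, which is consistent with the abstract's remark that these techniques are efficient.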
Sliced inverse regression for dimension reduction
J. AMER. STATIST. ASSOC., 1991
Optimization by Direct Search: New Perspectives on Some Classical and Modern Methods
SIAM Review, Vol. 45, No. 3, pp. 385–482, 2003
Cited by 204 (14 self)
Direct search methods are best known as unconstrained optimization techniques that do not explicitly use derivatives. Direct search methods were formally proposed and widely applied in the 1960s but fell out of favor with the mathematical optimization community by the early 1970s because they lacked coherent mathematical analysis. Nonetheless, users remained loyal to these methods, most of which were easy to program, some of which were reliable. In the past fifteen years, these methods have seen a revival due, in part, to the appearance of mathematical analysis, as well as to interest in parallel and distributed computing. This review begins by briefly summarizing the history of direct search methods and considering the special properties of problems for which they are well suited. Our focus then turns to a broad class of methods for which we provide a unifying framework that lends itself to a variety of convergence results. The underlying principles allow generalization to handle bound constraints and linear constraints. We also discuss extensions to problems with nonlinear constraints.
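As an illustration of what "does not explicitly use derivatives" means in practice, here is a minimal sketch of compass (coordinate) search, a simple member of the direct search family. It is not taken from the review: the function name `compass_search`, the step-halving rule, and the default tolerances are assumptions for this example.

```python
def compass_search(f, x0, step=1.0, tol=1e-6, max_iter=10000):
    """Minimise f by polling +/- step along each coordinate axis.

    Only function values are compared; no gradient is computed.
    The step is halved whenever no polled point improves on the
    current best, and the search stops once the step drops below tol.
    """
    x = list(x0)
    fx = f(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x.copy()
                trial[i] += delta
                f_trial = f(trial)
                if f_trial < fx:
                    # Accept the first improving point and poll again.
                    x, fx = trial, f_trial
                    improved = True
                    break
            if improved:
                break
        if not improved:
            # No axis direction helped: contract the pattern.
            step *= 0.5
    return x, fx


if __name__ == "__main__":
    def quadratic(p):
        return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

    print(compass_search(quadratic, [0.0, 0.0]))
```

The sketch handles only the unconstrained case; the bound and linear constraints mentioned in the abstract would need a modified set of poll directions.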
An Introduction to Regression Graphics
1994
Cited by 113 (12 self)
This article, which is based on an Interface tutorial, presents an overview of regression graphics, along with an annotated bibliography. The intent is to discuss basic ideas and issues without delving into methodological or theoretical details, and to provide a guide to the literature.
Some New Three Level Designs for the Study of Quantitative Variables
1960
Cited by 95 (0 self)
This article describes some methods which enable us to construct small designs for quantitative factors, while maintaining as much orthogonality of the design as possible. To calculate the D
hypercube sampling and propagation of uncertainty in analyses of complex systems, in: Sandia Report SAND2001-0417
2001
Optimization via simulation: a review
Annals of Operations Research, 1994
Cited by 79 (21 self)
We review techniques for optimizing stochastic discrete-event systems via simulation. We discuss both the discrete parameter case and the continuous parameter case, but concentrate on the latter, which has dominated most of the recent research in the area. For the discrete parameter case, we focus on the techniques for optimization from a finite set: multiple-comparison procedures and ranking-and-selection procedures. For the continuous parameter case, we focus on gradient-based methods, including perturbation analysis, the likelihood ratio method, and frequency domain experimentation. For illustrative purposes, we compare and contrast the implementation of the techniques for some simple discrete-event systems such as the (s, S) inventory system and the GI/G/1 queue. Finally, we speculate on future directions for the field, particularly in the context of the rapid advances being made in parallel computing.
Flexibility and Efficiency Enhancements for Constrained Global Design Optimization with Kriging Approximations
2002
Memory-based Stochastic Optimization
Neural Information Processing Systems 8, 1995
Cited by 50 (7 self)
In this paper we introduce new algorithms for optimizing noisy plants in which each experiment is very expensive. The algorithms build a global nonlinear model of the expected output at the same time as using Bayesian linear regression analysis of locally weighted polynomial models. The local model answers queries about confidence, noise, gradient and Hessians, and uses them to make automated decisions similar to those made by a practitioner of Response Surface Methodology. The global and local models are combined naturally as a locally weighted regression. We examine the question of whether the global model can really help optimization, and we extend it to the case of time-varying functions. We compare the new algorithms with a highly tuned higher-order stochastic optimization algorithm on randomly generated functions and a simulated manufacturing task. We note significant improvements in total regret, time to converge, and final solution quality.