Reinforcement learning: a survey
Journal of Artificial Intelligence Research, 1996
Cited by 1134 (21 self)
This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
Active Learning with Statistical Models
1995
Cited by 402 (7 self)
For many types of learners one can compute the statistically "optimal" way to select data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992; Cohn, 1994]. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate.
Optimization by direct search: New perspectives on some classical and modern methods
SIAM Review, 2003
Cited by 72 (14 self)
Direct search methods are best known as unconstrained optimization techniques that do not explicitly use derivatives. Direct search methods were formally proposed and widely applied in the 1960s but fell out of favor with the mathematical optimization community by the early 1970s because they lacked coherent mathematical analysis. Nonetheless, users remained loyal to these methods, most of which were easy to program, some of which were reliable. In the past fifteen years, these methods have seen a revival due, in part, to the appearance of mathematical analysis, as well as to interest in parallel and distributed computing. This review begins by briefly summarizing the history of direct search methods and considering the special properties of problems for which they are well suited. Our focus then turns to a broad class of methods for which we provide a unifying framework that lends itself to a variety of convergence results. The underlying principles allow generalization to handle bound constraints and linear constraints. We also discuss extensions to problems with nonlinear constraints.
An Introduction to Regression Graphics
1994
Cited by 56 (8 self)
This article, which is based on an Interface tutorial, presents an overview of regression graphics, along with an annotated bibliography. The intent is to discuss basic ideas and issues without delving into methodological or theoretical details, and to provide a guide to the literature.
Optimization via simulation: a review
Annals of Operations Research, 1994
Cited by 52 (16 self)
We review techniques for optimizing stochastic discrete-event systems via simulation. We discuss both the discrete parameter case and the continuous parameter case, but concentrate on the latter which has dominated most of the recent research in the area. For the discrete parameter case, we focus on the techniques for optimization from a finite set: multiple-comparison procedures and ranking-and-selection procedures. For the continuous parameter case, we focus on gradient-based methods, including perturbation analysis, the likelihood ratio method, and frequency domain experimentation. For illustrative purposes, we compare and contrast the implementation of the techniques for some simple discrete-event systems such as the (s, S) inventory system and the GI/G/1 queue. Finally, we speculate on future directions for the field, particularly in the context of the rapid advances being made in parallel computing.
Memory-based Stochastic Optimization
Neural Information Processing Systems 8, 1995
Cited by 35 (7 self)
In this paper we introduce new algorithms for optimizing noisy plants in which each experiment is very expensive. The algorithms build a global non-linear model of the expected output at the same time as using Bayesian linear regression analysis of locally weighted polynomial models. The local model answers queries about confidence, noise, gradient and Hessians, and uses them to make automated decisions similar to those made by a practitioner of Response Surface Methodology. The global and local models are combined naturally as a locally weighted regression. We examine the question of whether the global model can really help optimization, and we extend it to the case of time-varying functions. We compare the new algorithms with a highly tuned higher-order stochastic optimization algorithm on randomly-generated functions and a simulated manufacturing task. We note significant improvements in total regret, time to converge, and final solution quality.
Some New Three Level Designs for the Study of Quantitative Variables
1960
Cited by 27 (0 self)
This article describes some methods which enable us to construct small designs for quantitative factors, while maintaining as much orthogonality of the design as possible. To calculate the D-
A new approach to the construction of optimal designs
J. Statistical Planning and Inference, 1993
Cited by 26 (8 self)
By combining a modified version of Hooke and Jeeves' pattern search with exact or Monte Carlo moment calculations, it is possible to find I-, D- and A-optimal (or nearly optimal) designs for a wide range of response-surface problems. The algorithm routinely handles problems involving the minimization of functions of 1000 variables, and so for example can construct designs for a full quadratic response-surface depending on 12 continuous process variables. The algorithm handles continuous or discrete variables, linear equality or inequality constraints, and a response surface that is any low degree polynomial. The design may be required to include a specified set of points, so a sequence of designs can be obtained, each optimal given that the earlier runs have been made. The modeling region need not coincide with the measurement region. The algorithm has been implemented in a program called gosset, which has been used to compute extensive tables of designs. Many of these are more efficient than the best designs previously known.
Operating Regime Based Process Modeling and Identification
Computers and Chemical Engineering, 1994
Cited by 25 (12 self)
This paper presents a non-linear modeling framework that supports model development "in between" empirical and mechanistic modeling. A model is composed of a number of local models valid in different operating regimes. The local models are combined by smooth interpolation into a complete global model. It is illustrated how different kinds of empirical and mechanistic knowledge and models can be combined with process data within this framework. Furthermore, we describe a flexible computer aided modeling tool that supports modeling within this framework. Simple but illustrative examples from chemical engineering are used to highlight the flexibility and power of the framework.

