## Using Gaussian Processes to Optimize Expensive Functions.

Citations: 6 (0 self)

### BibTeX

```bibtex
@MISC{Frean_usinggaussian,
  author = {Marcus Frean and Phillip Boyle},
  title  = {Using Gaussian Processes to Optimize Expensive Functions.},
  year   = {}
}
```


### Abstract

The task of finding the optimum of some function f(x) is commonly accomplished by generating and testing sample solutions iteratively, choosing each new sample x heuristically on the basis of results to date. We use Gaussian processes to represent predictions and uncertainty about the true function, and describe how to use these predictions to choose where to take each new sample in an optimal way. By doing this we were able to solve a difficult optimization problem (finding weights in a neural network controller to simultaneously balance two vertical poles) using an order of magnitude fewer samples than reported elsewhere.
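The iterative scheme the abstract describes (fit a Gaussian process to the samples gathered so far, then pick the next sample by maximising a criterion such as expected improvement) can be sketched as follows. The squared-exponential kernel, its length-scale, the toy objective, and the grid search standing in for a gradient-based inner optimisation are all illustrative assumptions, not details taken from the paper:

```python
import numpy as np
from math import erf, sqrt, pi

def rbf(a, b, length_scale=0.3):
    # squared-exponential covariance between point sets of shape (n, d) and (m, d)
    sq = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / length_scale**2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # standard GP regression equations; rbf(x, x) = 1 for this kernel
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    alpha = np.linalg.solve(K, y)
    mu = Ks.T @ alpha
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.maximum(var, 1e-12)

def expected_improvement(mu, var, y_best):
    # EI for maximisation: (mu - y_best) * Phi(z) + s * phi(z)
    s = np.sqrt(var)
    z = (mu - y_best) / s
    Phi = 0.5 * (1.0 + np.array([erf(v / sqrt(2.0)) for v in z]))
    phi = np.exp(-0.5 * z**2) / sqrt(2.0 * pi)
    return (mu - y_best) * Phi + s * phi

f = lambda x: -(x - 0.7) ** 2            # toy "expensive" objective, optimum at 0.7
X = np.array([[0.1], [0.5], [0.9]])      # initial samples
y = f(X[:, 0])
for _ in range(5):
    grid = np.linspace(0.0, 1.0, 201)[:, None]  # grid search in place of a gradient method
    mu, var = gp_posterior(X, y, grid)
    x_new = grid[np.argmax(expected_improvement(mu, var, y.max()))]
    X = np.vstack([X, x_new])
    y = np.append(y, f(x_new[0]))
best_x = X[np.argmax(y), 0]              # should land near the true optimum at 0.7
```

With only three initial samples and five further evaluations, the sampler concentrates near the optimum, which is the sample-efficiency argument the abstract makes.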

### Citations

4823 | Neural Networks for Pattern Recognition - Bishop - 1995

Citation Context: ...n’s [5] application to one-dimensional curve fitting. Buntine [6], MacKay [7], and Neal [8] introduced a Bayesian interpretation that provided a consistent method for handling network complexity (see [9,10] for reviews), followed by regression in a machine learning context [11–13]. See [14–16] for good introductions. Interesting machine learning applications include reinforcement learning [17], incorpor...

1867 | Numerical recipes in C: the art of scientific computing - Press - 1993

Citation Context: ...xnew to evaluate by finding a point that maximises the expected improvement. This can be achieved by using the gradient of the expected improvement as input to (e.g.) the conjugate gradient algorithm [23]. To overcome the problem of suboptimal local maxima, multiple restarts are made starting from randomly selected points in the current data set X. The new observation ynew is found from f∗(xnew) and...
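The multi-restart pattern described in this context can be sketched as follows. For simplicity, plain gradient ascent on a numerical gradient stands in for the conjugate gradient algorithm, and a fixed multimodal function stands in for expected improvement; both substitutions are assumptions for illustration:

```python
import numpy as np

def criterion(x):
    # stand-in for expected improvement: smooth but multimodal in x
    return np.sin(5.0 * x) * np.exp(-x**2)

def grad(x, h=1e-5):
    # central-difference gradient, the kind of input a gradient-based optimiser needs
    return (criterion(x + h) - criterion(x - h)) / (2.0 * h)

def ascend(x, lr=0.02, steps=500):
    # plain gradient ascent standing in for the conjugate gradient algorithm
    for _ in range(steps):
        x = x + lr * grad(x)
    return x

# multiple restarts from spread-out points guard against suboptimal local maxima;
# the paper restarts from randomly selected points of the data set X instead
starts = np.linspace(-2.0, 2.0, 9)
x_new = max((ascend(s) for s in starts), key=criterion)
```

A single ascent can stall on whichever local maximum is nearest its start; keeping the best of several runs recovers the global maximum of the criterion with high probability.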

1156 | Information Theory, Inference, and Learning Algorithms - MacKay - 2005 |

1110 |
Statistics for Spatial Data
- Cressie
- 1993
(Show Context)
Citation Context ...aussian process regression is a machine learning technique for infering likely values of y for a novel input x. The study of Gaussian processes for prediction began in geostatistics with kriging [3], =-=[4]-=- and O’Hagan’s [5] application to one-dimensional curve fitting. Buntine [6], MacKay [7], and Neal [8] introduced a Bayesian interpretation that provided a consistent method for handling network compl... |

324 | Evolving neural networks through augmenting topologies - Stanley, Miikkulainen - 2002

Citation Context: ...as possible, one solution is to wiggle the poles back and forth about a central position. To prevent this, Gruau [27] defined a fitness function that penalises such solutions, fgruau = 0.1f1 + 0.9f2 [24,26]. The two components are defined over 1000 time steps (10 seconds simulated time): f1 = t/1000 (2), and f2 = 0 if t < 100, otherwise f2 = 0.75 / Σ_{i=t−100..t} (|x_i| + |ẋ_i| + |θ1_i| + |θ̇1_i|) (3), where t is the numb...
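The fitness quoted in the context above, fgruau = 0.1 f1 + 0.9 f2 with f2 penalising recent cart and pole movement, can be written as a short function. The per-step state tuple layout (x, ẋ, θ1, θ̇1) is an assumption about how the simulator reports its trajectory:

```python
def gruau_fitness(t, states):
    # t:      number of time steps the poles stayed balanced (up to 1000)
    # states: one (x, x_dot, theta1, theta1_dot) tuple per time step
    f1 = t / 1000.0
    if t < 100:
        f2 = 0.0
    else:
        # denominator sums recent cart offsets and velocities, so steady
        # solutions score higher than ones that wiggle the poles
        recent = states[t - 100:t]
        f2 = 0.75 / sum(abs(x) + abs(xd) + abs(th) + abs(thd)
                        for x, xd, th, thd in recent)
    return 0.1 * f1 + 0.9 * f2
```

For instance, a controller that balances for all 1000 steps while holding |x| = 0.01 and |θ1| = 0.01 with zero velocities scores 0.1·1 + 0.9·0.75/2 = 0.4375. (A mathematically perfect standstill would zero the denominator; real simulated trajectories never do.)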

222 | Gaussian processes for regression - Williams, Rasmussen - 1996 |

150 | Bayesian methods for adaptive models - MacKay - 1992

Citation Context: ...y for a novel input x. The study of Gaussian processes for prediction began in geostatistics with kriging [3], [4] and O’Hagan’s [5] application to one-dimensional curve fitting. Buntine [6], MacKay [7], and Neal [8] introduced a Bayesian interpretation that provided a consistent method for handling network complexity (see [9,10] for reviews), followed by regression in a machine learning context [11...

140 | Evaluation of Gaussian Processes and Other Methods for Non-Linear Regression - Rasmussen - 1996 |

139 | Probable networks and plausible predictions -- a review of practical Bayesian methods for supervised neural networks - MacKay - 1995

Citation Context: ...n’s [5] application to one-dimensional curve fitting. Buntine [6], MacKay [7], and Neal [8] introduced a Bayesian interpretation that provided a consistent method for handling network complexity (see [9,10] for reviews), followed by regression in a machine learning context [11–13]. See [14–16] for good introductions. Interesting machine learning applications include reinforcement learning [17], incorpor...

138 | A comparison between cellular encoding and direct encoding for genetic neural networks - Gruau, Whitley, et al. - 1996

Citation Context: ...tanley and Miikkulainen [24–26]. If the goal is to keep the poles balanced for as long as possible, one solution is to wiggle the poles back and forth about a central position. To prevent this, Gruau [27] defined a fitness function that penalises such solutions, fgruau = 0.1f1 + 0.9f2 [24,26]. The two components are defined over 1000 time steps (10 seconds simulated time): f1 = t/1000 (2), and f2 = 0 if t < ...

123 | Bayesian back-propagation - Buntine, Weigend - 1991

Citation Context: ...ly values of y for a novel input x. The study of Gaussian processes for prediction began in geostatistics with kriging [3], [4] and O’Hagan’s [5] application to one-dimensional curve fitting. Buntine [6], MacKay [7], and Neal [8] introduced a Bayesian interpretation that provided a consistent method for handling network complexity (see [9,10] for reviews), followed by regression in a machine learning...

122 | Monte Carlo Implementation of Gaussian Process Models for Bayesian Regression and Classification - Neal - 1997 |

121 | A taxonomy of global optimization methods based on response surfaces - Jones - 2001

Citation Context: ...on of robotic control systems. In the response surface methodology [1] we construct a response surface and search that surface for likely candidate points, measured according to some criterion. Jones [2] provides a summary of many such methods and discusses their relative merits. As a simple example, consider a noiseless optimisation problem where, given an initial set of samples, we proceed as follow...

84 | Gaussian process dynamical models - Wang, Fleet, et al. - 2006

Citation Context: ...learning applications include reinforcement learning [17], incorporation of derivative observations [18], speeding up the evaluation of Bayesian integrals [19,20], and as models of dynamical systems [21]. The key assumption is that the posterior distribution p(y|x, D) is Gaussian. To compute its mean and variance, one specifies a valid covariance function cov(x, x′), and defines vector k where ki = ...
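Spelled out (a standard presentation of GP regression rather than a detail taken from this page), with K the covariance matrix over the data, k the vector defined in the context above, and a noise variance σ_n² included as an assumption for noisy observations, the posterior mean and variance are:

```latex
\mu(\mathbf{x}) = \mathbf{k}^{\top} (K + \sigma_n^2 I)^{-1} \mathbf{y},
\qquad
\sigma^2(\mathbf{x}) = \operatorname{cov}(\mathbf{x}, \mathbf{x})
  - \mathbf{k}^{\top} (K + \sigma_n^2 I)^{-1} \mathbf{k}.
```

Both quantities are what the optimisation criterion (e.g. expected improvement) consumes at a candidate point x.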

52 | Efficient reinforcement learning through evolving neural network topologies - Stanley, Miikkulainen - 2002 |

47 | Efficient Evolution of Neural Networks through Complexification - Stanley, Miikkulainen - 2004

Citation Context: ...as possible, one solution is to wiggle the poles back and forth about a central position. To prevent this, Gruau [27] defined a fitness function that penalises such solutions, fgruau = 0.1f1 + 0.9f2 [24,26]. The two components are defined over 1000 time steps (10 seconds simulated time): f1 = t/1000 (2), and f2 = 0 if t < 100, otherwise f2 = 0.75 / Σ_{i=t−100..t} (|x_i| + |ẋ_i| + |θ1_i| + |θ̇1_i|) (3), where t is the numb...

43 | Principles of Geostatistics, Economic Geology - Matheron - 1963

Citation Context: ...N}, Gaussian process regression is a machine learning technique for inferring likely values of y for a novel input x. The study of Gaussian processes for prediction began in geostatistics with kriging [3], [4] and O’Hagan’s [5] application to one-dimensional curve fitting. Buntine [6], MacKay [7], and Neal [8] introduced a Bayesian interpretation that provided a consistent method for handling network...

43 | Derivative observations in Gaussian Process models of dynamic Systems - Solak, Murray-Smith, et al. - 2003

Citation Context: ...sion in a machine learning context [11–13]. See [14–16] for good introductions. Interesting machine learning applications include reinforcement learning [17], incorporation of derivative observations [18], speeding up the evaluation of Bayesian integrals [19,20], and as models of dynamical systems [21]. The key assumption is that the posterior distribution p(y|x, D) is Gaussian. To compute its mean an...

38 | Bayesian Training of Backpropagation Networks by the Hybrid Monte Carlo Method - Neal - 1992

Citation Context: ...input x. The study of Gaussian processes for prediction began in geostatistics with kriging [3], [4] and O’Hagan’s [5] application to one-dimensional curve fitting. Buntine [6], MacKay [7], and Neal [8] introduced a Bayesian interpretation that provided a consistent method for handling network complexity (see [9,10] for reviews), followed by regression in a machine learning context [11–13]. See [14–...

37 | Accelerating evolutionary algorithms with Gaussian process fitness function models - Buche, Schraudolph, et al. - 2005

Citation Context: ...f the search surface. Further details are given in [20]. Jones [2] first introduced kriging for optimisation using expected improvement to select the next iterate. Büche, Schraudolph and Koumoutsakos [22] explicitly used Gaussian processes for optimisation, and demonstrated the algorithm’s effectiveness on a number of benchmark problems. This work did not make use of expected improvement, did not plac...

35 | Curve fitting and optimal design for prediction (with discussion) - O’Hagan - 1985

Citation Context: ...gression is a machine learning technique for inferring likely values of y for a novel input x. The study of Gaussian processes for prediction began in geostatistics with kriging [3], [4] and O’Hagan’s [5] application to one-dimensional curve fitting. Buntine [6], MacKay [7], and Neal [8] introduced a Bayesian interpretation that provided a consistent method for handling network complexity (see [9,10]...

34 | Gaussian processes in reinforcement learning - Rasmussen, Kuss - 2004

Citation Context: ...ty (see [9,10] for reviews), followed by regression in a machine learning context [11–13]. See [14–16] for good introductions. Interesting machine learning applications include reinforcement learning [17], incorporation of derivative observations [18], speeding up the evaluation of Bayesian integrals [19,20], and as models of dynamical systems [21]. The key assumption is that the posterior distributio...

22 | Automatic gait optimization with Gaussian process regression - Lizotte, Wang, et al. - 2007

Citation Context: ...of using an axis-aligned covariance function to optimise objective functions with correlated output (dependent) variables. The algorithm presented here takes all these factors into account. Recently, [28] used similar ideas to those presented here to optimize the gait of a mobile robot, although they use a different criterion (probability of any improvement) and don’t deal with correlated variabl...

16 | Gaussian processes to speed up hybrid Monte Carlo for expensive Bayesian integrals - Rasmussen - 2003

Citation Context: ...for good introductions. Interesting machine learning applications include reinforcement learning [17], incorporation of derivative observations [18], speeding up the evaluation of Bayesian integrals [19,20], and as models of dynamical systems [21]. The key assumption is that the posterior distribution p(y|x, D) is Gaussian. To compute its mean and variance, one specifies a valid covariance function cov(...

7 | Bayesian Gaussian Processes for Classification and Regression - Gibbs - 1997 |

5 | Gaussian Processes for Regression and Optimisation - Boyle - 2007
Citation Context ... for good introductions. Interesting machine learning applications include reinforcement learning [17], incorporation of derivative observations [18], speeding up the evaluation of Bayesian integrals =-=[19,20]-=-, and as models of dynamical systems [21]. The key assumption is that the posterior distribution p(y|x, D) is Gaussian. To compute its mean and variance, one specifies a valid covariance function cov(... |