Results 11–20 of 145
Algorithms for hyperparameter optimization
In NIPS, 2011
Abstract

Cited by 9 (2 self)
Several recent advances to the state of the art in image classification benchmarks have come from better configurations of existing techniques rather than novel approaches to feature learning. Traditionally, hyperparameter optimization has been the job of humans because they can be very efficient in regimes where only a few trials are possible. Presently, computer clusters and GPU processors make it possible to run more trials and we show that algorithmic approaches can find better results. We present hyperparameter optimization results on tasks of training neural networks and deep belief networks (DBNs). We optimize hyperparameters using random search and two new greedy sequential methods based on the expected improvement criterion. Random search has been shown to be sufficiently efficient for learning neural networks for several datasets, but we show it is unreliable for training DBNs. The sequential algorithms are applied to the most difficult DBN learning problems from [1] and find significantly better results than the best previously reported. This work contributes novel techniques for making response surface models P(y|x) in which many elements of the hyperparameter assignment x are known to be irrelevant given particular values of other elements.
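The expected improvement criterion underlying the sequential methods above has a standard closed form under a Gaussian posterior. A minimal sketch (minimization convention; this is a generic textbook formula, not the paper's code):

```python
import math

def expected_improvement(mu, sigma, best):
    """Expected improvement (minimization) of a candidate whose GP
    posterior is N(mu, sigma^2), over the incumbent best observed value."""
    if sigma <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf
```

A greedy sequential method evaluates the point maximizing this quantity, updates the surrogate, and repeats.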
Gaussian Process Preference Elicitation
Abstract

Cited by 9 (2 self)
Bayesian approaches to preference elicitation (PE) are particularly attractive due to their ability to explicitly model uncertainty in users' latent utility functions. However, previous approaches to Bayesian PE have ignored the important problem of generalizing from previous users to an unseen user in order to reduce the elicitation burden on new users. In this paper, we address this deficiency by introducing a Gaussian Process (GP) prior over users' latent utility functions on the joint space of user and item features. We learn the hyperparameters of this GP on a set of preferences of previous users and use it to aid in the elicitation process for a new user. This approach provides a flexible model of a multi-user utility function, facilitates an efficient value of information (VOI) heuristic query selection strategy, and provides a principled way to incorporate the elicitations of multiple users back into the model. We show the effectiveness of our method in comparison to previous work on a real dataset of user preferences over sushi types.
Brain-computer evolutionary multiobjective optimization (BCEMO): a genetic algorithm adapting to the decision maker
IEEE Transactions on Evolutionary Computation, 2010
Abstract

Cited by 9 (7 self)
The centrality of the decision maker (DM) is widely recognized in the multiple criteria decision-making community. This translates into emphasis on seamless human–computer interaction, and adaptation of the solution technique to the knowledge which is progressively acquired from the DM. This paper adopts the methodology of reactive search optimization (RSO) for evolutionary interactive multiobjective optimization. RSO follows the paradigm of “learning while optimizing,” through the use of online machine learning techniques as an integral part of a self-tuning optimization scheme. User judgments of pairs of solutions are used to build robust incremental models of the user utility function, with the objective of reducing the cognitive burden required from the DM to identify a satisficing solution. The technique of support vector ranking is used together with a k-fold cross-validation procedure to select the best kernel for the problem at hand, during the utility function training procedure. Experimental results are presented for a series of benchmark problems. Index Terms: interactive decision making, machine learning, reactive search optimization, support vector ranking.
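The k-fold cross-validation loop used above for kernel selection can be sketched generically; the fold-assignment scheme and function name here are illustrative, not taken from the paper:

```python
def kfold_splits(n, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation
    over n examples, assigning example i to fold i mod k."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Training set is every other fold, flattened.
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```

Each candidate kernel would be scored by averaging a ranking-quality measure over the k held-out folds, and the best-scoring kernel retained for training the utility model.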
Batch Bayesian Optimization via Simulation Matching
Abstract

Cited by 7 (4 self)
Bayesian optimization methods are often used to optimize unknown functions that are costly to evaluate. Typically, these methods sequentially select inputs to be evaluated one at a time based on a posterior over the unknown function that is updated after each evaluation. In many applications, however, it is desirable to perform multiple evaluations in parallel, which requires selecting batches of multiple inputs to evaluate at once. In this paper, we propose a novel approach to batch Bayesian optimization, providing a policy for selecting batches of inputs with the goal of optimizing the function as efficiently as possible. The key idea is to exploit the availability of high-quality and efficient sequential policies, by using Monte Carlo simulation to select input batches that closely match their expected behavior. Our experimental results on six benchmarks show that the proposed approach significantly outperforms two baselines and can lead to large advantages over a top sequential approach in terms of performance per unit time.
A multi-points criterion for deterministic parallel global optimization based on Gaussian processes
Journal of Global Optimization (in revision), 2009
Abstract

Cited by 7 (1 self)
The optimization of expensive-to-evaluate functions generally relies on metamodel-based exploration strategies. Many deterministic global optimization algorithms used in the field of computer experiments are based on Kriging (Gaussian process regression). Starting with a spatial predictor including a measure of uncertainty, they proceed by iteratively choosing the point maximizing a criterion which is a compromise between predicted performance and uncertainty. Distributing the evaluation of such numerically expensive objective functions on many processors is an appealing idea. Here we investigate a multi-points optimization criterion, the multi-points expected improvement (q-EI), aimed at choosing several points at the same time. An analytical expression of the q-EI is given when q = 2, and a consistent statistical estimate is given for the general case. We then propose two classes of heuristic strategies meant to approximately optimize the q-EI, and apply them to Gaussian processes and to the classical Branin-Hoo test-case function. It is finally demonstrated on the covered example that the latter strategies perform as well as the best Latin Hypercubes and Uniform Designs ever found by simulation (2000 designs drawn at random for every q ∈ [1, 10]).
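The consistent statistical estimate of q-EI mentioned above can be realized by Monte Carlo over the joint Gaussian posterior at the q candidate points. A sketch, assuming the posterior covariance is supplied via its lower-triangular Cholesky factor (function name and interface are illustrative):

```python
import random

def q_ei_monte_carlo(means, chol, best, n_samples=20000, seed=0):
    """Monte Carlo estimate of the multi-point expected improvement
    q-EI = E[max(best - min_i Y_i, 0)], where the q candidate outputs Y
    are jointly Gaussian with mean vector `means` and covariance L L^T,
    L being the lower-triangular Cholesky factor `chol`."""
    rng = random.Random(seed)
    q = len(means)
    total = 0.0
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(q)]
        # Correlated joint sample: y = means + L z.
        y = [means[i] + sum(chol[i][j] * z[j] for j in range(i + 1))
             for i in range(q)]
        total += max(best - min(y), 0.0)
    return total / n_samples
```

The estimate is consistent in the sample size; in practice a vectorized implementation and common random numbers across candidate batches would be used.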
Portfolio Allocation for Bayesian Optimization
Abstract

Cited by 6 (4 self)
Bayesian optimization with Gaussian processes has become an increasingly popular tool in the machine learning community. It is efficient and can be used when very little is known about the objective function, making it popular in expensive black-box optimization scenarios. It uses Bayesian methods to sample the objective efficiently using an acquisition function which incorporates the posterior estimate of the objective. However, there are several different parameterized acquisition functions in the literature, and it is often unclear which one to use. Instead of using a single acquisition function, we adopt a portfolio of acquisition functions governed by an online multi-armed bandit strategy. We propose several portfolio strategies, the best of which we call GP-Hedge, and show that this method outperforms the best individual acquisition function. We also provide a theoretical bound on the algorithm's performance.
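A Hedge-style portfolio over acquisition functions can be sketched as follows; the class structure and reward convention are illustrative, not the paper's exact GP-Hedge implementation:

```python
import math
import random

class HedgePortfolio:
    """Minimal sketch of a Hedge-style portfolio over k acquisition
    functions: sample one nominee in proportion to exponentiated
    cumulative gains, then update all arms (full information)."""

    def __init__(self, k, eta=1.0, seed=0):
        self.gains = [0.0] * k
        self.eta = eta
        self.rng = random.Random(seed)

    def probabilities(self):
        # Softmax of cumulative gains, shifted for numerical stability.
        m = max(self.gains)
        e = [math.exp(self.eta * (g - m)) for g in self.gains]
        s = sum(e)
        return [x / s for x in e]

    def select(self):
        # Sample the index of the acquisition function whose nominated
        # point is evaluated next.
        r = self.rng.random()
        acc = 0.0
        p = self.probabilities()
        for i, pi in enumerate(p):
            acc += pi
            if r <= acc:
                return i
        return len(p) - 1

    def update(self, rewards):
        # Every acquisition function receives a reward (e.g. the GP
        # posterior mean at the point it nominated this round).
        for i, r in enumerate(rewards):
            self.gains[i] += r
```

Over time the portfolio concentrates probability on whichever acquisition function has been nominating the most promising points.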
Using Response Surfaces and Expected Improvement to Optimize Snake Robot Gait Parameters
Abstract

Cited by 6 (1 self)
Several categories of optimization problems suffer from expensive objective function evaluation, driving the need for smart selection of subsequent experiments. One such category of problems involves physical robotic systems, which often require significant time, effort, and monetary expenditure in order to run tests. To assist in the selection of the next experiment, there has been a focus on the idea of response surfaces in recent years. These surfaces interpolate the existing data and provide a measure of confidence in their error, serving as a low-fidelity surrogate function that can be used to more intelligently choose the next experiment. In this paper, we robustly implement a previous algorithm based on the response surface methodology with an expected improvement criterion. We apply this technique to optimize open-loop gait parameters for snake robots, and demonstrate improved locomotive capabilities.
A kriging-based method for the solution of mixed-integer nonlinear programs containing black-box functions, 2009
Surrogating the surrogate: accelerating Gaussian-process-based global optimization with a mixture cross-entropy algorithm
Abstract

Cited by 6 (0 self)
In global optimization, when the evaluation of the target function is costly, the usual strategy is to learn a surrogate model for the target function and replace the initial optimization by the optimization of the model. Gaussian processes have been widely used since they provide an elegant way to model the fitness and to deal with the exploration-exploitation tradeoff in a principled way. Several empirical criteria have been proposed to drive the model optimization, among which is the well-known Expected Improvement criterion. The major computational bottleneck of these algorithms is the exhaustive grid search used to optimize the highly multimodal merit function. In this paper, we propose a competitive “adaptive grid” approach, based on a properly derived cross-entropy optimization algorithm with mixture proposals. Experiments suggest that 1) we outperform the classical single-Gaussian cross-entropy method when the fitness function is highly multimodal, and 2) we improve on standard exhaustive search in GP-based surrogate optimization.
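For contrast with the mixture proposals above, the classical single-Gaussian cross-entropy baseline the authors improve upon can be sketched in a few lines (one-dimensional, hypothetical parameter defaults):

```python
import random

def cross_entropy_maximize(f, mu=0.0, sigma=3.0, n=200, n_elite=20,
                           iters=40, seed=0):
    """Single-Gaussian cross-entropy maximization: sample from
    N(mu, sigma^2), keep the elite fraction under f, and refit the
    Gaussian to the elites until the proposal concentrates."""
    rng = random.Random(seed)
    for _ in range(iters):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        elite = sorted(xs, key=f, reverse=True)[:n_elite]
        mu = sum(elite) / n_elite
        var = sum((x - mu) ** 2 for x in elite) / n_elite
        sigma = max(var ** 0.5, 1e-9)  # floor keeps sampling well-defined
    return mu
```

With a single Gaussian the proposal can collapse onto one mode of a multimodal merit function, which is precisely the failure mode the paper's mixture proposals address.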
Bayesian optimization for sensor set selection
In Proceedings of the 9th ACM/IEEE International Conference on Information Processing in Sensor Networks, 2010
Abstract

Cited by 6 (0 self)
We consider the problem of selecting an optimal set of sensors, as determined, for example, by the predictive accuracy of the resulting sensor network. Given an underlying metric between pairs of set elements, we introduce a natural metric between sets of sensors for this task. Using this metric, we can construct covariance functions over sets, and thereby perform Gaussian process inference over a function whose domain is a power set. If the function has additional inputs, our covariances can be readily extended to incorporate them—allowing us to consider, for example, functions over both sets and time. These functions can then be optimized using Gaussian process global optimization (GPGO). We use the root mean squared error (RMSE) of the predictions made using a set of sensors at a particular time as an example of such a function to be optimized; the optimal point specifies the best choice of sensor locations. We demonstrate the resulting method by dynamically selecting the best subset of a given set of weather sensors for the prediction of the air temperature across the United Kingdom.
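One illustrative way to build a covariance over sets from an element metric, in the spirit described above, is a Hausdorff-style set distance plugged into a squared-exponential kernel. Note this construction is only a stand-in for the paper's metric, and such distance-substitution kernels are not guaranteed positive definite in general:

```python
import math

def set_distance(A, B, d):
    """Hausdorff-style distance between nonempty finite sets A and B,
    built from an element metric d (an illustrative stand-in for the
    paper's set metric)."""
    d_ab = max(min(d(a, b) for b in B) for a in A)
    d_ba = max(min(d(a, b) for a in A) for b in B)
    return max(d_ab, d_ba)

def set_covariance(A, B, d, lengthscale=1.0):
    """Squared-exponential covariance over sets via the set distance,
    enabling GP inference over a function whose domain is a power set."""
    r = set_distance(A, B, d)
    return math.exp(-0.5 * (r / lengthscale) ** 2)
```

Identical sets get covariance 1, and covariance decays as sets drift apart under the element metric; extra inputs such as time could enter as additional multiplicative kernel factors.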