Results 11–20 of 20

### The Parallel Knowledge Gradient Method for Batch Bayesian Optimization

In many applications of black-box optimization, one can evaluate multiple points simultaneously, e.g. when evaluating the performance of several different neural networks in a parallel computing environment. In this paper, we develop a novel batch Bayesian optimization algorithm, the parallel knowledge gradient method. By construction, this method provides the one-step Bayes-optimal batch of points to sample. We provide an efficient strategy for computing this Bayes-optimal batch of points, and we demonstrate that the parallel knowledge gradient method finds global optima significantly faster than previous batch Bayesian optimization algorithms on both synthetic test functions and when tuning hyperparameters of practical machine learning algorithms, especially when function evaluations are noisy.
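The one-step Bayes-optimal batch criterion described above can be approximated by plain Monte Carlo over a discrete candidate set. The sketch below is illustrative only, not the authors' implementation: it assumes a GP posterior mean `mu` and covariance `K` over the candidates are already available, and estimates the expected gain in the best posterior mean from observing a hypothetical batch.

```python
import numpy as np

def mc_knowledge_gradient(mu, K, batch, noise_var=1e-2, n_samples=500, seed=0):
    """Monte Carlo estimate of the knowledge gradient of a batch (sketch).

    mu : (n,) posterior mean over a discrete candidate set
    K  : (n, n) posterior covariance over the same set
    batch : indices of the points that would be evaluated in parallel
    Returns E[max_i mu_new(i)] - max_i mu(i), the expected one-step gain in
    the best posterior mean from observing the batch.
    """
    rng = np.random.default_rng(seed)
    B = np.asarray(batch)
    S = K[np.ix_(B, B)] + noise_var * np.eye(len(B))  # batch covariance + noise
    L = np.linalg.cholesky(S)
    gain = K[:, B] @ np.linalg.inv(S)                 # Kalman-style gain, (n, q)
    best_now = mu.max()
    vals = np.empty(n_samples)
    for t in range(n_samples):
        y = mu[B] + L @ rng.standard_normal(len(B))   # simulated batch outcome
        vals[t] = (mu + gain @ (y - mu[B])).max()     # posterior-mean update
    return vals.mean() - best_now
```

Picking the batch with the largest estimated gain (e.g. by enumeration or gradient ascent over batch locations) gives a batch version of the knowledge-gradient policy.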

### Active Learning for Level Set Estimation

Many information gathering problems require determining the set of points for which an unknown function takes values above or below some given threshold level. We formalize this task as a classification problem with sequential measurements, where the unknown function is modeled as a sample from a Gaussian process (GP). We propose LSE, an algorithm that guides both sampling and classification based on GP-derived confidence bounds, and provide theoretical guarantees about its sample complexity. Furthermore, we extend LSE and its theory to two more natural settings: (1) where the threshold level is implicitly defined as a percentage of the (unknown) maximum of the target function, and (2) where samples are selected in batches. We evaluate the effectiveness of our proposed methods on two problems of practical interest, namely autonomous monitoring of algal populations in a lake environment and geolocating network latency.
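The confidence-bound classification rule the abstract describes can be sketched directly: a point is classified once its entire confidence interval falls on one side of the threshold, and the next measurement goes to the most ambiguous remaining point. This is a simplified illustration, not the LSE implementation; `beta` and the ambiguity measure are assumptions, and the GP posterior (`mu`, `sigma`) is taken as given.

```python
def lse_step(points, mu, sigma, h, beta=3.0):
    """One round of confidence-bound level set estimation (illustrative sketch).

    points : candidate locations
    mu, sigma : GP posterior mean / standard deviation at each point
    h : threshold level
    Returns (above, below, next_query), where next_query is the unclassified
    point whose confidence interval straddles h by the widest margin.
    """
    above, below, ambiguous = [], [], []
    for p, m, s in zip(points, mu, sigma):
        lo, hi = m - beta * s, m + beta * s
        if lo > h:            # whole confidence interval above the threshold
            above.append(p)
        elif hi < h:          # whole interval below the threshold
            below.append(p)
        else:                 # interval straddles h: cannot classify yet
            ambiguous.append((min(hi - h, h - lo), p))  # classification ambiguity
    next_query = max(ambiguous)[1] if ambiguous else None
    return above, below, next_query
```

Iterating this step (refitting the GP after each measurement) shrinks the ambiguous region until every point is classified.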

### and Regression

2014

Most existing GP regression algorithms assume a single generative model, leading to poor performance when data are nonstationary, i.e. generated from multiple switching processes. Existing methods for GP regression over non-stationary data include clustering and change-point detection algorithms. However, these methods require significant computation, do not come with provable guarantees on correctness and speed, and most algorithms only work in batch settings. This thesis presents an efficient online GP framework, GP-NBC, that leverages the generalized likelihood ratio test to detect changepoints and learn multiple Gaussian process models from streaming data. Furthermore, GP-NBC can quickly recognize and reuse previously seen models. The algorithm is shown to be theoretically sample efficient in terms of limiting mistaken predictions. Our empirical results on two real-world datasets and one synthetic dataset show that GP-NBC outperforms state-of-the-art methods for nonstationary regression in terms of regression error and computational efficiency.
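The generalized likelihood ratio test at the heart of this approach can be illustrated in the simplest Gaussian setting: compare the likelihood of a recent window of data under the current model against the best-fitting alternative. The function below is a generic mean-shift GLR sketch, not the GP-NBC code; `mu0`, `var`, and the threshold are placeholders for the current model's predictive distribution.

```python
def glr_changepoint(window, mu0, var, threshold=10.0):
    """Generalized likelihood ratio test for a mean shift (illustrative sketch).

    Compares the likelihood of the recent window under the current model
    N(mu0, var) against the best alternative N(xbar, var), where xbar is the
    window's empirical mean. A large ratio flags a changepoint, after which a
    new model would be fit (or a previously seen one reused).
    Returns (changed, log_likelihood_ratio).
    """
    n = len(window)
    xbar = sum(window) / n
    # For a known-variance Gaussian, the log-likelihood ratio simplifies to
    # n * (xbar - mu0)^2 / (2 * var).
    llr = n * (xbar - mu0) ** 2 / (2.0 * var)
    return llr > threshold, llr
```

In the GP setting, the reference likelihood would come from the current GP's predictive mean and variance at each streaming input rather than a fixed `mu0`.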

### A Study of Asynchronous Budgeted Optimization

Budgeted optimization algorithms based on Gaussian processes have attracted a lot of attention as a way to deal with computationally expensive cost functions. Recently, parallel versions of these algorithms have further improved their ability to address the computation cost bottleneck. This article focuses on those algorithms that maximize the multi-point expected improvement criterion. Synchronous and asynchronous parallel versions of such algorithms are compared. It is shown that asynchronous algorithms are slower than synchronous ones iteration-wise. In terms of wall-clock time, however, and contrary to synchronous algorithms, asynchronous implementations benefit from a near-linear speed-up of up to the order of 100 computing nodes. This speed-up is bounded by the time needed to maximize the expected improvement criterion which, by becoming a critical limitation, may change the way researchers look at budgeted optimization methods.
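The multi-point expected improvement criterion these algorithms maximize has no convenient closed form for large batches, but it is straightforward to estimate by Monte Carlo from joint posterior draws. The sketch below assumes the joint posterior (`mu`, `cov`) at a candidate batch is given; it illustrates the criterion itself, not the synchronous/asynchronous schedulers compared in the article.

```python
import numpy as np

def mc_multipoint_ei(mu, cov, f_best, n_samples=4000, seed=0):
    """Monte Carlo estimate of the multi-point (q-point) expected improvement.

    mu, cov : posterior mean (q,) and covariance (q, q) of the objective at a
              candidate batch of q points (maximization convention)
    f_best  : best function value observed so far
    Returns E[max(0, max_j f_j - f_best)], the expected improvement if the
    whole batch were evaluated.
    """
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(cov + 1e-9 * np.eye(len(mu)))  # jitter for stability
    z = rng.standard_normal((n_samples, len(mu)))
    f = mu + z @ L.T                                      # joint posterior draws
    return np.maximum(f.max(axis=1) - f_best, 0.0).mean()
```

Maximizing this estimate over batch locations is exactly the inner optimization whose cost, per the abstract, becomes the bottleneck once enough workers run asynchronously.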

### Batch Selection for Parallel Gaussian Process Bandits

This paper introduces multi-armed bandits, Gaussian processes, and their combination, Gaussian process bandits, as well as the problem of selecting candidates for batches when evaluating them in parallel. In addition to the standard GP-UCB successive selection case, two recent algorithms, GP-BUCB and GP-UCB-PE, are introduced, explained, and evaluated.
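The hallucination idea behind GP-BUCB can be sketched with a variance-only posterior update: after each pick, the GP pretends the point's own predicted mean was observed, which shrinks the uncertainty near it and pushes later picks elsewhere, while the posterior mean stays fixed. The function below is a simplified illustration over a discrete candidate set; `beta` and the noise level are assumptions.

```python
import numpy as np

def gp_bucb_batch(mu, K, batch_size, beta=2.0, noise_var=1e-2):
    """Select a batch by UCB with hallucinated observations (GP-BUCB-style sketch).

    mu : (n,) posterior mean over a discrete candidate set
    K  : (n, n) posterior covariance over the same set
    Each pick maximizes mu + beta * sigma; the covariance is then conditioned
    on the picked point as if it had been observed (variance-only update).
    """
    K = K.copy()
    chosen = []
    for _ in range(batch_size):
        ucb = mu + beta * np.sqrt(np.clip(np.diag(K), 0.0, None))
        ucb[chosen] = -np.inf                          # never repeat a pick
        j = int(np.argmax(ucb))
        chosen.append(j)
        kj = K[:, j].copy()
        K -= np.outer(kj, kj) / (K[j, j] + noise_var)  # hallucinated conditioning
    return chosen
```

Because only the variance is updated, the whole batch can be chosen before any real evaluation returns, which is what makes parallel evaluation possible.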

### Discovering Valuable Items from Massive Data

Suppose there is a large collection of items, each with an associated cost and an inherent utility that is revealed only once we commit to selecting it. Given a budget on the cumulative cost of the selected items, how can we pick a subset of maximal value? This task generalizes several important problems such as multi-armed bandits, active search, and the knapsack problem. We present an algorithm, GP-Select, which utilizes prior knowledge about similarity between items, expressed as a kernel function. GP-Select uses Gaussian process prediction to balance exploration (estimating the unknown value of items) and exploitation (selecting items of high value). We extend GP-Select to be able to discover sets that simultaneously have high utility and are diverse. Our preference for diversity can be specified as an arbitrary monotone submodular function that quantifies the diminishing returns obtained when selecting similar items. Furthermore, we exploit the structure of the model updates to achieve an order-of-magnitude (up to 40x) speedup in our experiments without resorting to approximations. We provide strong guarantees on the performance of GP-Select and apply it to three real-world case studies of industrial relevance: (1) refreshing a repository of prices in a Global Distribution System for the travel industry, (2) identifying diverse, binding-affine peptides in a vaccine design task, and (3) maximizing clicks in a web-scale recommender system by recommending items to users.
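The exploration/exploitation/diversity trade-off described above can be sketched as a budgeted greedy rule. The code below is a loose illustration, not the authors' algorithm: it uses a facility-location term as one concrete monotone submodular diversity function, scores items by value per unit cost, and takes the GP posterior (`mu`, `sigma`) and kernel similarities (`sim`) as given.

```python
import numpy as np

def gp_select(mu, sigma, sim, costs, budget, beta=2.0, lam=1.0):
    """Budgeted greedy selection trading off value and diversity (sketch).

    Score of an item = UCB value (mu + beta * sigma) plus lam times its marginal
    gain to the facility-location term f(S) = sum_i max_{j in S} sim[i, j],
    which is monotone submodular. Items are picked greedily by score per unit
    cost until no affordable item remains.
    """
    n = len(mu)
    covered = np.zeros(n)            # current max similarity to the chosen set
    chosen, spent = [], 0.0
    while True:
        best, best_j = None, None
        for j in range(n):
            if j in chosen or spent + costs[j] > budget:
                continue
            div_gain = np.maximum(sim[:, j], covered).sum() - covered.sum()
            score = (mu[j] + beta * sigma[j] + lam * div_gain) / costs[j]
            if best is None or score > best:
                best, best_j = score, j
        if best_j is None:
            return chosen
        chosen.append(best_j)
        spent += costs[best_j]
        covered = np.maximum(covered, sim[:, best_j])
```

In a full implementation the GP posterior would be refit after each utility is revealed; here the posterior is held fixed to keep the selection logic visible.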