Results 1–10 of 10
Bayesian optimization with unknown constraints
In International Conference on Artificial Intelligence and Statistics, 2014
"... Recent work on Bayesian optimization has shown its effectiveness in global optimization of difficult blackbox objective functions. Many realworld optimization problems of interest also have constraints which are unknown a priori. In this paper, we study Bayesian optimization for constrained proble ..."
Abstract

Cited by 8 (5 self)
 Add to MetaCart
(Show Context)
Recent work on Bayesian optimization has shown its effectiveness in global optimization of difficult black-box objective functions. Many real-world optimization problems of interest also have constraints which are unknown a priori. In this paper, we study Bayesian optimization for constrained problems in the general case in which noise may be present in the constraint functions, and the objective and constraints may be evaluated independently. We provide motivating practical examples, and present a general framework to solve such problems. We demonstrate the effectiveness of our approach on optimizing the performance of online latent Dirichlet allocation subject to topic sparsity constraints, tuning a neural network given test-time memory constraints, and optimizing Hamiltonian Monte Carlo to achieve maximal effectiveness in a fixed time, subject to passing standard convergence diagnostics.
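A common way to fold an independently modeled, possibly noisy constraint into the acquisition function is to weight expected improvement by the probability of feasibility. The sketch below illustrates that combination only; the paper's framework is more general, and all names and values here are illustrative.

```python
import numpy as np
from scipy.stats import norm

def constrained_ei(mu_f, sd_f, best_feasible, mu_c, sd_c):
    """Expected improvement for a minimization objective, weighted by the
    probability that an independently modeled constraint c(x) <= 0 holds."""
    sd_f = np.maximum(sd_f, 1e-9)
    z = (best_feasible - mu_f) / sd_f
    ei = (best_feasible - mu_f) * norm.cdf(z) + sd_f * norm.pdf(z)
    p_feasible = norm.cdf(-mu_c / np.maximum(sd_c, 1e-9))  # Pr[c(x) <= 0]
    return ei * p_feasible

# Posterior means/stds at two candidate points, from separate surrogate models
# for the objective (f) and the noisy constraint (c), evaluated independently.
mu_f, sd_f = np.array([0.2, 0.5]), np.array([0.1, 0.3])
mu_c, sd_c = np.array([-0.4, 0.6]), np.array([0.2, 0.2])
print(constrained_ei(mu_f, sd_f, best_feasible=0.3, mu_c=mu_c, sd_c=sd_c))
```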
RELEAF: An Algorithm for Learning and Exploiting Relevance
"... Abstract—Recommender systems, medical diagnosis, network security, etc., require ongoing learning and decisionmaking in real time. These – and many others – represent perfect examples of the opportunities and difficulties presented by Big Data: the available information often arrives from a variet ..."
Abstract
 Add to MetaCart
Recommender systems, medical diagnosis, network security, etc., require ongoing learning and decision-making in real time. These – and many others – represent perfect examples of the opportunities and difficulties presented by Big Data: the available information often arrives from a variety of sources and has diverse features, so that learning from all the sources may be valuable but integrating what is learned is subject to the curse of dimensionality. This paper develops and analyzes algorithms that allow efficient learning and decision-making while avoiding the curse of dimensionality. We formalize the information available to the learner/decision-maker at a particular time as a context vector which the learner should consider when taking actions. In general the context vector is very high-dimensional, but in many settings the most relevant information is embedded in only a few relevant dimensions. If these relevant dimensions were known in advance, the problem would be simple – but they are not. Moreover, the relevant dimensions may be different for different actions. Our algorithm learns the relevant dimensions for each action, and makes decisions based on what it has learned. Formally, we build on the structure of a contextual multi-armed bandit by adding and exploiting a relevance relation. We prove a general regret bound for our algorithm whose time order depends only on the maximum number of relevant dimensions among all the actions, which in the special case where the relevance relation is single-valued (a function) reduces to Õ(T^{2(√2−1)}); in the absence of a relevance relation, the best known contextual bandit algorithms achieve regret Õ(T^{(D+1)/(D+2)}), where D is the full dimension of the context vector. Our algorithm alternates between exploring and exploiting and does not require observing outcomes during exploitation (so it allows for active learning). Moreover, during exploitation, suboptimal actions are chosen with arbitrarily low probability. Our algorithm is tested on datasets arising from network security and online news article recommendations. Index Terms—Contextual bandits, regret, dimensionality reduction, learning relevance, recommender systems, online learning, active learning.
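To see why the bound matters, compare the exponents: 2(√2−1) ≈ 0.83 is independent of the context dimension D, while (D+1)/(D+2) approaches 1 as D grows. A quick numeric check, a sketch using only the two exponents stated above:

```python
import math

releaf = 2 * (math.sqrt(2) - 1)      # ~0.828, independent of context dimension D
for D in (2, 10, 100):
    generic = (D + 1) / (D + 2)      # best known exponent without a relevance relation
    print(f"D={D:3d}: relevance-based T^{releaf:.3f} vs generic T^{generic:.3f}")
```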
Gains and Losses are Fundamentally Different in Regret Minimization: The Sparse Case, 2016
"... Abstract We demonstrate that, in the classical nonstochastic regret minimization problem with d decisions, gains and losses to be respectively maximized or minimized are fundamentally different. Indeed, by considering the additional sparsity assumption (at each stage, at most s decisions incur a n ..."
Abstract
 Add to MetaCart
(Show Context)
We demonstrate that, in the classical non-stochastic regret minimization problem with d decisions, gains and losses to be respectively maximized or minimized are fundamentally different. Indeed, by considering the additional sparsity assumption (at each stage, at most s decisions incur a non-zero outcome), we derive optimal regret bounds of different orders. Specifically, with gains, we obtain an optimal regret guarantee after T stages of order √(T log s), so the classical dependency on the dimension is replaced by the sparsity size. With losses, we provide matching upper and lower bounds of order √(Ts log(d)/d), which is decreasing in d. Finally, we also study the bandit setting, and obtain an upper bound of order √(Ts log(d/s)) when outcomes are losses. This bound is proven to be optimal up to the logarithmic factor log(d/s).
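Plugging illustrative numbers into the three bound orders makes the gains/losses gap concrete; this is a sketch using only the rates stated above, with arbitrary T, d, s:

```python
import math

T, d, s = 100_000, 1000, 10  # illustrative horizon, dimension, sparsity

gains  = math.sqrt(T * math.log(s))          # full information, gains
losses = math.sqrt(T * s * math.log(d) / d)  # full information, losses
bandit = math.sqrt(T * s * math.log(d / s))  # bandit feedback, losses

print(f"gains  ~ {gains:9.1f}")   # dependency on d replaced by s
print(f"losses ~ {losses:9.1f}")  # shrinks as d grows with s fixed
print(f"bandit ~ {bandit:9.1f}")
```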
Discovering, Learning and Exploiting Relevance
"... Abstract In this paper we consider the problem of learning online what is the information to consider when making sequential decisions. We formalize this as a contextual multiarmed bandit problem where a high dimensional (Ddimensional) context vector arrives to a learner which needs to select an ..."
Abstract
 Add to MetaCart
(Show Context)
In this paper we consider the problem of learning online what information to consider when making sequential decisions. We formalize this as a contextual multi-armed bandit problem in which a high-dimensional (D-dimensional) context vector arrives at a learner, which needs to select an action to maximize its expected reward at each time step. Each dimension of the context vector is called a type. We assume that there exists an unknown relation between actions and types, called the relevance relation, such that the reward of an action only depends on the contexts of the relevant types. When the relation is a function, i.e., the reward of an action only depends on the context of a single type, and the expected reward of an action is Lipschitz continuous in the context of its relevant type, we propose an algorithm that achieves Õ(T^γ) regret with high probability, where γ = 2/(1 + √2). Our algorithm achieves this by learning the unknown relevance relation, whereas prior contextual bandit algorithms that do not exploit the existence of a relevance relation will have Õ(T^{(D+1)/(D+2)}) regret. Our algorithm alternates between exploring and exploiting, does not require reward observations during exploitation, and guarantees with high probability that actions with suboptimality greater than ε are never selected during exploitation. Our proposed method can be applied to a variety of learning applications, including medical diagnosis, recommender systems, popularity prediction from social networks, and network security, where at each instance of time vast amounts of different types of information are available to the decision maker, but the effect of an action depends only on a single type.
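The exponent here is the same one that appears in the RELEAF abstract above; rationalizing the denominator shows the two forms agree: γ = 2/(1 + √2) = 2(√2 − 1)/((√2 + 1)(√2 − 1)) = 2(√2 − 1)/(2 − 1) = 2(√2 − 1) ≈ 0.828.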
Taking the Human Out of the Loop: A Review of Bayesian Optimization
"... Big data applications are typically associated with systems involving large numbers of users, massive complex software systems, and largescale heterogeneous computing and storage architectures. The construction of such systems involves many distributed design choices. The end products (e.g., reco ..."
Abstract
 Add to MetaCart
Big data applications are typically associated with systems involving large numbers of users, massive complex software systems, and large-scale heterogeneous computing and storage architectures. The construction of such systems involves many distributed design choices. The end products (e.g., recommendation systems, medical analysis tools, real-time game engines, speech recognizers) thus involve many tunable configuration parameters. These parameters are often specified and hard-coded into the software by various developers or teams. If optimized jointly, these parameters can result in significant improvements. Bayesian optimization is a powerful tool for the joint optimization of design choices that has gained great popularity in recent years. It promises greater automation so as to increase both product quality and human productivity. This review paper introduces Bayesian optimization, highlights some of its methodological aspects, and showcases a wide range of applications.
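In outline, the loop such a review describes fits a cheap surrogate to the evaluations seen so far and picks the next configuration by maximizing an acquisition function. A minimal sketch using scikit-learn's GP and expected improvement; the toy objective and all names are illustrative, not the review's own implementation:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def noisy_objective(x):            # stand-in for an expensive black-box system
    return -(x - 0.3) ** 2 + 0.01 * np.random.randn()

X = list(np.random.uniform(0, 1, 3))      # a few initial random evaluations
y = [noisy_objective(x) for x in X]

for _ in range(20):
    gp = GaussianProcessRegressor(normalize_y=True)
    gp.fit(np.array(X).reshape(-1, 1), y)
    cand = np.linspace(0, 1, 200).reshape(-1, 1)
    mu, sd = gp.predict(cand, return_std=True)
    best = max(y)
    z = (mu - best) / np.maximum(sd, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sd * norm.pdf(z)   # expected improvement
    x_next = float(cand[np.argmax(ei)][0])
    X.append(x_next); y.append(noisy_objective(x_next))

print("best configuration found:", X[int(np.argmax(y))])
```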
Scalable Bayesian Optimization Using Deep Neural Networks
"... Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effect ..."
Abstract
 Add to MetaCart
(Show Context)
Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effectiveness of the approach, and is typically fit using Gaussian processes (GPs). However, since GPs scale cubically with the number of observations, it has been challenging to handle objectives whose optimization requires many evaluations, and hence to massively parallelize the optimization. In this work, we explore the use of neural networks as an alternative to GPs for modeling distributions over functions. We show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly, rather than cubically, with the number of observations. This allows us to achieve a previously intractable degree of parallelism, which we apply to large-scale hyperparameter optimization, rapidly finding competitive models on benchmark object recognition tasks using convolutional networks, and on image caption generation using neural language models.
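The linear scaling comes from treating the network's last hidden layer as a set of basis functions and performing Bayesian linear regression on top of it. A sketch under that assumption, with random Fourier features standing in for a trained network's features; all names and hyperparameters are illustrative:

```python
import numpy as np

# Bayesian linear regression on basis functions phi(x). In the paper the basis
# comes from a trained neural net's last hidden layer; random Fourier features
# stand in for it here.
rng = np.random.default_rng(0)
D, alpha, beta = 50, 1.0, 25.0          # basis size, prior and noise precision
W, b = rng.normal(size=(D, 1)), rng.uniform(0, 2 * np.pi, D)

def phi(x):                              # x: (n, 1) -> features (n, D)
    return np.cos(x @ W.T + b)

X = rng.uniform(-3, 3, (40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)

# Posterior over the last linear layer: cost is O(n D^2 + D^3), i.e. linear in
# the number of observations n, unlike the O(n^3) of an exact GP.
Phi = phi(X)
A = alpha * np.eye(D) + beta * Phi.T @ Phi
mean_w = beta * np.linalg.solve(A, Phi.T @ y)

x_test = np.linspace(-3, 3, 5).reshape(-1, 1)
P = phi(x_test)
mu = P @ mean_w
var = 1.0 / beta + np.einsum("nd,nd->n", P @ np.linalg.inv(A), P)
print(np.c_[mu, np.sqrt(var)])          # predictive mean and std
```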
Stochastic Continuum-Armed Bandit Problem of Few Linear Parameters in High Dimensions
"... ar ..."
(Show Context)
Online Clustering of Bandits
Claudio Gentile
"... We introduce a novel algorithmic approach to content recommendation based on adaptive clustering of explorationexploitation (“bandit”) strategies. We provide a sharp regret analysis of this algorithm in a standard stochastic noise setting, demonstrate its scalability properties, and prove its effe ..."
Abstract
 Add to MetaCart
(Show Context)
We introduce a novel algorithmic approach to content recommendation based on adaptive clustering of exploration-exploitation (“bandit”) strategies. We provide a sharp regret analysis of this algorithm in a standard stochastic noise setting, demonstrate its scalability properties, and prove its effectiveness on a number of artificial and real-world datasets. Our experiments show a significant increase in prediction performance over state-of-the-art methods for bandit problems.
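At a high level, the idea is to keep per-user linear bandit statistics, link users whose parameter estimates look indistinguishable, and recommend from aggregated cluster statistics. A rough sketch of that structure only; the fixed threshold and all names are illustrative, whereas the paper's clustering rule is confidence-bound based:

```python
import numpy as np

n_users, d = 20, 5
rng = np.random.default_rng(0)
M = np.stack([np.eye(d)] * n_users)   # per-user ridge matrices
b = np.zeros((n_users, d))            # per-user reward-weighted contexts

def estimate(i):
    return np.linalg.solve(M[i], b[i])

def same_cluster(i, j, threshold=0.5):
    # Link users whose estimated parameter vectors are close.
    return np.linalg.norm(estimate(i) - estimate(j)) < threshold

def recommend(i, contexts, alpha=1.0):
    # Aggregate statistics over user i's current cluster, then play UCB.
    cluster = [j for j in range(n_users) if same_cluster(i, j)]
    M_c = np.eye(d) + sum(M[j] - np.eye(d) for j in cluster)
    w_c = np.linalg.solve(M_c, sum(b[j] for j in cluster))
    ucb = [x @ w_c + alpha * np.sqrt(x @ np.linalg.solve(M_c, x)) for x in contexts]
    return int(np.argmax(ucb))

def update(i, x, reward):
    M[i] += np.outer(x, x)
    b[i] += reward * x

ctxs = list(rng.normal(size=(4, d)))
arm = recommend(0, ctxs)
update(0, ctxs[arm], reward=1.0)
```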
Supplementary Material to “Online Clustering of Bandits”
Claudio Gentile
"... This supplementary material contains all proofs and technical details omitted from the main text, along with ancillary comments, discussion about related work, and extra experimental results. 1. Proof of Theorem 1 The following sequence of lemmas are of preliminary importance. The first one needs e ..."
Abstract
 Add to MetaCart
(Show Context)
This supplementary material contains all proofs and technical details omitted from the main text, along with ancillary comments, discussion of related work, and extra experimental results.
1. Proof of Theorem 1
The following sequence of lemmas is of preliminary importance. The first one needs extra variance conditions on the process X generating the context vectors. We find it convenient to introduce the node counterpart to TCB_{j,t−1}(x), and the cluster counterpart to T̃CB_{i,t−1}. Given round t, node i ∈ V, and cluster index j ∈ {1, ..., m_t}, we let TCB_{i,t−1}(x) = x^⊤ M^{−1}_{i,t−1} x
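For concreteness, the quadratic form x^⊤ M^{−1} x appearing in such confidence bounds can be evaluated with a linear solve instead of an explicit inverse; a minimal sketch, with dimensions and names purely illustrative:

```python
import numpy as np

# Evaluate the quadratic form x^T M^{-1} x that appears inside the
# confidence bound, without forming the inverse explicitly.
d = 5
rng = np.random.default_rng(1)
M = np.eye(d) + sum(np.outer(z, z) for z in rng.normal(size=(30, d)))
x = rng.normal(size=d)

width_sq = x @ np.linalg.solve(M, x)   # x^T M^{-1} x
print(np.sqrt(width_sq))               # typical confidence-width scale
```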