Results 1–10 of 10
Bayesian optimization with unknown constraints
In International Conference on Artificial Intelligence and Statistics, 2014
"... Recent work on Bayesian optimization has shown its effectiveness in global optimization of difficult blackbox objective functions. Many realworld optimization problems of interest also have constraints which are unknown a priori. In this paper, we study Bayesian optimization for constrained proble ..."
Abstract

Cited by 8 (5 self)
 Add to MetaCart
(Show Context)
Recent work on Bayesian optimization has shown its effectiveness in global optimization of difficult black-box objective functions. Many real-world optimization problems of interest also have constraints which are unknown a priori. In this paper, we study Bayesian optimization for constrained problems in the general case in which noise may be present in the constraint functions, and the objective and constraints may be evaluated independently. We provide motivating practical examples, and present a general framework to solve such problems. We demonstrate the effectiveness of our approach on optimizing the performance of online latent Dirichlet allocation subject to topic sparsity constraints, tuning a neural network given test-time memory constraints, and optimizing Hamiltonian Monte Carlo to achieve maximal effectiveness in a fixed time, subject to passing standard convergence diagnostics.
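A common way to fold an independently modeled, possibly noisy constraint into the acquisition function is to weight expected improvement by the probability of feasibility. The sketch below illustrates that combination only; the paper's framework is more general, and all names and values here are illustrative.

```python
import numpy as np
from scipy.stats import norm

def constrained_ei(mu_f, sd_f, best_feasible, mu_c, sd_c):
    """Expected improvement for a minimization objective, weighted by the
    probability that an independently modeled constraint c(x) <= 0 holds."""
    sd_f = np.maximum(sd_f, 1e-9)
    z = (best_feasible - mu_f) / sd_f
    ei = (best_feasible - mu_f) * norm.cdf(z) + sd_f * norm.pdf(z)
    p_feasible = norm.cdf(-mu_c / np.maximum(sd_c, 1e-9))  # Pr[c(x) <= 0]
    return ei * p_feasible

# Posterior means/stds at two candidate points, from separate surrogate models
# for the objective (f) and the noisy constraint (c), evaluated independently.
mu_f, sd_f = np.array([0.2, 0.5]), np.array([0.1, 0.3])
mu_c, sd_c = np.array([-0.4, 0.6]), np.array([0.2, 0.2])
print(constrained_ei(mu_f, sd_f, best_feasible=0.3, mu_c=mu_c, sd_c=sd_c))
```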
RELEAF: An Algorithm for Learning and Exploiting Relevance
"... Abstract—Recommender systems, medical diagnosis, network security, etc., require ongoing learning and decisionmaking in real time. These – and many others – represent perfect examples of the opportunities and difficulties presented by Big Data: the available information often arrives from a variet ..."
Abstract
 Add to MetaCart
Recommender systems, medical diagnosis, network security, etc., require ongoing learning and decision-making in real time. These – and many others – represent perfect examples of the opportunities and difficulties presented by Big Data: the available information often arrives from a variety of sources and has diverse features, so that learning from all the sources may be valuable but integrating what is learned is subject to the curse of dimensionality. This paper develops and analyzes algorithms that allow efficient learning and decision-making while avoiding the curse of dimensionality. We formalize the information available to the learner/decision-maker at a particular time as a context vector which the learner should consider when taking actions. In general the context vector is very high-dimensional, but in many settings the most relevant information is embedded in only a few relevant dimensions. If these relevant dimensions were known in advance, the problem would be simple – but they are not. Moreover, the relevant dimensions may be different for different actions. Our algorithm learns the relevant dimensions for each action, and makes decisions based on what it has learned. Formally, we build on the structure of a contextual multi-armed bandit by adding and exploiting a relevance relation. We prove a general regret bound for our algorithm whose time order depends only on the maximum number of relevant dimensions among all the actions, which in the special case where the relevance relation is single-valued (a function) reduces to Õ(T^{2(√2−1)}); in the absence of a relevance relation, the best known contextual bandit algorithms achieve regret Õ(T^{(D+1)/(D+2)}), where D is the full dimension of the context vector. Our algorithm alternates between exploring and exploiting and does not require observing outcomes during exploitation (so it allows for active learning). Moreover, during exploitation, suboptimal actions are chosen with arbitrarily low probability. Our algorithm is tested on datasets arising from network security and online news article recommendations. Index Terms—Contextual bandits, regret, dimensionality reduction, learning relevance, recommender systems, online learning, active learning.
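To see why the bound matters, compare the exponents: 2(√2−1) ≈ 0.83 is independent of the context dimension D, while (D+1)/(D+2) approaches 1 as D grows. A quick numeric check, a sketch using only the two exponents stated above:

```python
import math

releaf = 2 * (math.sqrt(2) - 1)      # ~0.828, independent of context dimension D
for D in (2, 10, 100):
    generic = (D + 1) / (D + 2)      # best known exponent without a relevance relation
    print(f"D={D:3d}: relevance-based T^{releaf:.3f} vs generic T^{generic:.3f}")
```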
Gains and Losses are Fundamentally Different in Regret Minimization: The Sparse Case, 2016
"... Abstract We demonstrate that, in the classical nonstochastic regret minimization problem with d decisions, gains and losses to be respectively maximized or minimized are fundamentally different. Indeed, by considering the additional sparsity assumption (at each stage, at most s decisions incur a n ..."
Abstract
 Add to MetaCart
(Show Context)
We demonstrate that, in the classical non-stochastic regret minimization problem with d decisions, gains and losses to be respectively maximized or minimized are fundamentally different. Indeed, by considering the additional sparsity assumption (at each stage, at most s decisions incur a non-zero outcome), we derive optimal regret bounds of different orders. Specifically, with gains, we obtain an optimal regret guarantee after T stages of order √(T log s), so the classical dependency on the dimension is replaced by the sparsity size. With losses, we provide matching upper and lower bounds of order √(Ts log(d)/d), which is decreasing in d. Finally, we also study the bandit setting, and obtain an upper bound of order √(Ts log(d/s)) when outcomes are losses. This bound is proven to be optimal up to the logarithmic factor log(d/s).
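Plugging illustrative numbers into the three bound orders makes the gains/losses gap concrete; this is a sketch using only the rates stated above, with arbitrary T, d, s:

```python
import math

T, d, s = 100_000, 1000, 10  # illustrative horizon, dimension, sparsity

gains  = math.sqrt(T * math.log(s))          # full information, gains
losses = math.sqrt(T * s * math.log(d) / d)  # full information, losses
bandit = math.sqrt(T * s * math.log(d / s))  # bandit feedback, losses

print(f"gains  ~ {gains:9.1f}")   # dependency on d replaced by s
print(f"losses ~ {losses:9.1f}")  # shrinks as d grows with s fixed
print(f"bandit ~ {bandit:9.1f}")
```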
Discovering, Learning and Exploiting Relevance
"... Abstract In this paper we consider the problem of learning online what is the information to consider when making sequential decisions. We formalize this as a contextual multiarmed bandit problem where a high dimensional (Ddimensional) context vector arrives to a learner which needs to select an ..."
Abstract
 Add to MetaCart
(Show Context)
In this paper we consider the problem of learning online what information to consider when making sequential decisions. We formalize this as a contextual multi-armed bandit problem in which a high-dimensional (D-dimensional) context vector arrives at a learner, which needs to select an action to maximize its expected reward at each time step. Each dimension of the context vector is called a type. We assume that there exists an unknown relation between actions and types, called the relevance relation, such that the reward of an action only depends on the contexts of the relevant types. When the relation is a function, i.e., the reward of an action only depends on the context of a single type, and the expected reward of an action is Lipschitz continuous in the context of its relevant type, we propose an algorithm that achieves Õ(T^γ) regret with high probability, where γ = 2/(1 + √2). Our algorithm achieves this by learning the unknown relevance relation, whereas prior contextual bandit algorithms that do not exploit the existence of a relevance relation will have Õ(T^{(D+1)/(D+2)}) regret. Our algorithm alternates between exploring and exploiting, does not require reward observations during exploitation, and guarantees with high probability that actions with suboptimality greater than ε are never selected during exploitation. Our proposed method can be applied to a variety of learning applications, including medical diagnosis, recommender systems, popularity prediction from social networks, and network security, where at each instance of time vast amounts of different types of information are available to the decision maker, but the effect of an action depends only on a single type.
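The exponent here is the same one that appears in the RELEAF abstract above; rationalizing the denominator shows the two forms agree: γ = 2/(1 + √2) = 2(√2 − 1)/((√2 + 1)(√2 − 1)) = 2(√2 − 1)/(2 − 1) = 2(√2 − 1) ≈ 0.828.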
Taking the Human Out of the Loop: A Review of Bayesian Optimization
"... Big data applications are typically associated with systems involving large numbers of users, massive complex software systems, and largescale heterogeneous computing and storage architectures. The construction of such systems involves many distributed design choices. The end products (e.g., reco ..."
Abstract
 Add to MetaCart
Big data applications are typically associated with systems involving large numbers of users, massive complex software systems, and large-scale heterogeneous computing and storage architectures. The construction of such systems involves many distributed design choices. The end products (e.g., recommendation systems, medical analysis tools, real-time game engines, speech recognizers) thus involve many tunable configuration parameters. These parameters are often specified and hard-coded into the software by various developers or teams. If optimized jointly, these parameters can result in significant improvements. Bayesian optimization is a powerful tool for the joint optimization of design choices that has gained great popularity in recent years. It promises greater automation so as to increase both product quality and human productivity. This review paper introduces Bayesian optimization, highlights some of its methodological aspects, and showcases a wide range of applications.
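In outline, the loop such a review describes fits a cheap surrogate to the evaluations seen so far and picks the next configuration by maximizing an acquisition function. A minimal sketch using scikit-learn's GP and expected improvement; the toy objective and all names are illustrative, not the review's own implementation:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def noisy_objective(x):            # stand-in for an expensive black-box system
    return -(x - 0.3) ** 2 + 0.01 * np.random.randn()

X = list(np.random.uniform(0, 1, 3))      # a few initial random evaluations
y = [noisy_objective(x) for x in X]

for _ in range(20):
    gp = GaussianProcessRegressor(normalize_y=True)
    gp.fit(np.array(X).reshape(-1, 1), y)
    cand = np.linspace(0, 1, 200).reshape(-1, 1)
    mu, sd = gp.predict(cand, return_std=True)
    best = max(y)
    z = (mu - best) / np.maximum(sd, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sd * norm.pdf(z)   # expected improvement
    x_next = float(cand[np.argmax(ei)][0])
    X.append(x_next); y.append(noisy_objective(x_next))

print("best configuration found:", X[int(np.argmax(y))])
```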
Scalable Bayesian Optimization Using Deep Neural Networks
"... Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effect ..."
Abstract
 Add to MetaCart
(Show Context)
Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effectiveness of the approach, and is typically fit using Gaussian processes (GPs). However, since GPs scale cubically with the number of observations, it has been challenging to handle objectives whose optimization requires many evaluations, and hence to massively parallelize the optimization. In this work, we explore the use of neural networks as an alternative to GPs for modeling distributions over functions. We show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly, rather than cubically, with the number of observations. This allows us to achieve a previously intractable degree of parallelism, which we apply to large-scale hyperparameter optimization, rapidly finding competitive models on benchmark object recognition tasks using convolutional networks, and on image caption generation using neural language models.
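The linear scaling comes from treating the network's last hidden layer as a set of basis functions and performing Bayesian linear regression on top of it. A sketch under that assumption, with random Fourier features standing in for a trained network's features; all names and hyperparameters are illustrative:

```python
import numpy as np

# Bayesian linear regression on basis functions phi(x). In the paper the basis
# comes from a trained neural net's last hidden layer; random Fourier features
# stand in for it here.
rng = np.random.default_rng(0)
D, alpha, beta = 50, 1.0, 25.0          # basis size, prior and noise precision
W, b = rng.normal(size=(D, 1)), rng.uniform(0, 2 * np.pi, D)

def phi(x):                              # x: (n, 1) -> features (n, D)
    return np.cos(x @ W.T + b)

X = rng.uniform(-3, 3, (40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)

# Posterior over the last linear layer: cost is O(n D^2 + D^3), i.e. linear in
# the number of observations n, unlike the O(n^3) of an exact GP.
Phi = phi(X)
A = alpha * np.eye(D) + beta * Phi.T @ Phi
mean_w = beta * np.linalg.solve(A, Phi.T @ y)

x_test = np.linspace(-3, 3, 5).reshape(-1, 1)
P = phi(x_test)
mu = P @ mean_w
var = 1.0 / beta + np.einsum("nd,nd->n", P @ np.linalg.inv(A), P)
print(np.c_[mu, np.sqrt(var)])          # predictive mean and std
```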
Stochastic Continuum-Armed Bandit Problem of Few Linear Parameters in High Dimensions
"... ar ..."
(Show Context)
Online Clustering of Bandits
Claudio Gentile
"... We introduce a novel algorithmic approach to content recommendation based on adaptive clustering of explorationexploitation (“bandit”) strategies. We provide a sharp regret analysis of this algorithm in a standard stochastic noise setting, demonstrate its scalability properties, and prove its effe ..."
Abstract
 Add to MetaCart
(Show Context)
We introduce a novel algorithmic approach to content recommendation based on adaptive clustering of exploration-exploitation (“bandit”) strategies. We provide a sharp regret analysis of this algorithm in a standard stochastic noise setting, demonstrate its scalability properties, and prove its effectiveness on a number of artificial and real-world datasets. Our experiments show a significant increase in prediction performance over state-of-the-art methods for bandit problems.
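At a high level, the idea is to keep per-user linear bandit statistics, link users whose parameter estimates look indistinguishable, and recommend from aggregated cluster statistics. A rough sketch of that structure only; the fixed threshold and all names are illustrative, whereas the paper's clustering rule is confidence-bound based:

```python
import numpy as np

n_users, d = 20, 5
rng = np.random.default_rng(0)
M = np.stack([np.eye(d)] * n_users)   # per-user ridge matrices
b = np.zeros((n_users, d))            # per-user reward-weighted contexts

def estimate(i):
    return np.linalg.solve(M[i], b[i])

def same_cluster(i, j, threshold=0.5):
    # Link users whose estimated parameter vectors are close.
    return np.linalg.norm(estimate(i) - estimate(j)) < threshold

def recommend(i, contexts, alpha=1.0):
    # Aggregate statistics over user i's current cluster, then play UCB.
    cluster = [j for j in range(n_users) if same_cluster(i, j)]
    M_c = np.eye(d) + sum(M[j] - np.eye(d) for j in cluster)
    w_c = np.linalg.solve(M_c, sum(b[j] for j in cluster))
    ucb = [x @ w_c + alpha * np.sqrt(x @ np.linalg.solve(M_c, x)) for x in contexts]
    return int(np.argmax(ucb))

def update(i, x, reward):
    M[i] += np.outer(x, x)
    b[i] += reward * x

ctxs = list(rng.normal(size=(4, d)))
arm = recommend(0, ctxs)
update(0, ctxs[arm], reward=1.0)
```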
Supplementary Material to “Online Clustering of Bandits”
Claudio Gentile
"... This supplementary material contains all proofs and technical details omitted from the main text, along with ancillary comments, discussion about related work, and extra experimental results. 1. Proof of Theorem 1 The following sequence of lemmas are of preliminary importance. The first one needs e ..."
Abstract
 Add to MetaCart
(Show Context)
This supplementary material contains all proofs and technical details omitted from the main text, along with ancillary comments, discussion of related work, and extra experimental results.
1. Proof of Theorem 1
The following sequence of lemmas is of preliminary importance. The first one needs extra variance conditions on the process X generating the context vectors. We find it convenient to introduce the node counterpart to TCB_{j,t−1}(x), and the cluster counterpart to T̃CB_{i,t−1}. Given round t, node i ∈ V, and cluster index j ∈ {1, ..., m_t}, we let TCB_{i,t−1}(x) = x^⊤ M^{−1}_{i,t−1} x
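For concreteness, the quadratic form x^⊤ M^{−1} x appearing in such confidence bounds can be evaluated with a linear solve instead of an explicit inverse; a minimal sketch, with dimensions and names purely illustrative:

```python
import numpy as np

# Evaluate the quadratic form x^T M^{-1} x that appears inside the
# confidence bound, without forming the inverse explicitly.
d = 5
rng = np.random.default_rng(1)
M = np.eye(d) + sum(np.outer(z, z) for z in rng.normal(size=(30, d)))
x = rng.normal(size=d)

width_sq = x @ np.linalg.solve(M, x)   # x^T M^{-1} x
print(np.sqrt(width_sq))               # typical confidence-width scale
```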