Results 1–10 of 20
Auto-WEKA: Combined Selection and Hyperparameter Optimization of Classification Algorithms
"... Many different machine learning algorithms exist; taking into account each algorithm’s hyperparameters, there is a staggeringly large number of possible alternatives overall. We consider the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previo ..."
Abstract

Cited by 32 (8 self)
 Add to MetaCart
(Show Context)
Many different machine learning algorithms exist; taking into account each algorithm’s hyperparameters, there is a staggeringly large number of possible alternatives overall. We consider the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previous work that attacks these issues separately. We show that this problem can be addressed by a fully automated approach, leveraging recent innovations in Bayesian optimization. Specifically, we consider a wide range of feature selection techniques (combining 3 search and 8 evaluator methods) and all classification approaches implemented in WEKA’s standard distribution, spanning 2 ensemble methods, 10 meta-methods, 27 base classifiers, and hyperparameter settings for each classifier. On each of 21 popular datasets from the UCI repository, the KDD Cup 09, variants of the MNIST dataset, and CIFAR-10, we show classification performance often much better than using standard selection and hyperparameter optimization methods. We hope that our approach will help non-expert users to more effectively identify machine learning algorithms and hyperparameter settings appropriate to their applications, and hence to achieve improved performance.
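To make the combined algorithm selection and hyperparameter optimization (CASH) problem concrete, here is a minimal sketch that random-searches a joint space of scikit-learn classifiers and their hyperparameters. The paper itself searches WEKA's classifiers with Bayesian optimization, so the library, the two-entry search space, and the random-search strategy below are simplified stand-ins, not the paper's method.

```python
# Minimal sketch of the CASH idea: search jointly over (algorithm,
# hyperparameters). Random search stands in for the paper's Bayesian
# optimizer; scikit-learn stands in for WEKA.
import random
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Joint search space: (algorithm constructor, hyperparameter sampler).
SPACE = [
    (SVC, lambda: {"C": 10 ** random.uniform(-2, 2),
                   "gamma": 10 ** random.uniform(-3, 1)}),
    (RandomForestClassifier, lambda: {"n_estimators": random.randint(10, 200),
                                      "max_depth": random.randint(2, 12)}),
]

best_score, best_config = -1.0, None
for _ in range(30):                      # budget of 30 joint evaluations
    algo, sample = random.choice(SPACE)
    params = sample()
    score = cross_val_score(algo(**params), X, y, cv=3).mean()
    if score > best_score:
        best_score, best_config = score, (algo.__name__, params)

print(best_config, best_score)
```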
Near-optimal Batch Mode Active Learning and Adaptive Submodular Optimization
"... Active learning can lead to a dramatic reduction in labeling effort. However, in many practical implementations (such as crowdsourcing, surveys, highthroughput experimental design), it is preferable to query labels for batches of examples to be labelled in parallel. While several heuristics have be ..."
Abstract

Cited by 12 (1 self)
 Add to MetaCart
Active learning can lead to a dramatic reduction in labeling effort. However, in many practical implementations (such as crowdsourcing, surveys, high-throughput experimental design), it is preferable to query labels for batches of examples to be labeled in parallel. While several heuristics have been proposed for batch-mode active learning, little is known about their theoretical performance. We consider batch mode active learning and more general information-parallel stochastic optimization problems that exhibit adaptive submodularity, a natural diminishing returns condition. We prove that for such problems, a simple greedy strategy is competitive with the optimal batch-mode policy. In some cases, surprisingly, the use of batches incurs competitively low cost, even when compared to a fully sequential strategy. We demonstrate the effectiveness of our approach on batch-mode active learning tasks, where it outperforms the state of the art, as well as on the novel problem of multi-stage influence maximization in social networks.
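The greedy strategy the paper analyzes is easy to sketch. Below, the log-determinant of an RBF kernel submatrix stands in for the kind of submodular informativeness measure the paper considers; the kernel, the utility, and the random pool are illustrative assumptions, not the paper's objective.

```python
# Greedy batch selection for a submodular utility, the strategy the paper
# proves competitive with the optimal batch-mode policy. The log-det
# utility here is a stand-in, not the paper's objective.
import numpy as np

def rbf_kernel(X, lengthscale=1.0):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * lengthscale ** 2))

def greedy_batch(X, k, noise=0.1):
    K = rbf_kernel(X) + noise * np.eye(len(X))
    batch = []
    for _ in range(k):
        def gain(i):
            S = batch + [i]
            return np.linalg.slogdet(K[np.ix_(S, S)])[1]   # log-det utility
        rest = [i for i in range(len(X)) if i not in batch]
        batch.append(max(rest, key=gain))                  # greedy step
    return batch

pool = np.random.rand(50, 2)       # unlabeled pool
print(greedy_batch(pool, k=5))     # indices to query in parallel
```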
Spectral bandits for smooth graph functions
In Proc. Intern. Conf. Mach. Learning (ICML), 2014
"... Abstract Smooth functions on graphs have wide applications in manifold and semisupervised learning. In this paper, we study a bandit problem where the payoffs of arms are smooth on a graph. This framework is suitable for solving online learning problems that involve graphs, such as contentbased re ..."
Abstract

Cited by 8 (5 self)
 Add to MetaCart
(Show Context)
Smooth functions on graphs have wide applications in manifold and semi-supervised learning. In this paper, we study a bandit problem where the payoffs of arms are smooth on a graph. This framework is suitable for solving online learning problems that involve graphs, such as content-based recommendation. In this problem, each item we can recommend is a node, and its expected rating is similar to that of its neighbors. The goal is to recommend items that have high expected ratings. We aim for algorithms whose cumulative regret with respect to the optimal policy does not scale poorly with the number of nodes. In particular, we introduce the notion of an effective dimension, which is small in real-world graphs, and propose two algorithms for solving our problem that scale linearly and sublinearly in this dimension. Our experiments on a real-world content recommendation problem show that a good estimator of user preferences for thousands of items can be learned from just tens of node evaluations.
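A rough sketch of the spectral-bandit recipe: run a linear UCB learner in the eigenbasis of the graph Laplacian, with a regularizer weighted by the eigenvalues so that smooth (low-frequency) reward functions are favored. The confidence width, constants, and toy path graph below are my assumptions, not the paper's exact algorithm.

```python
# LinUCB-style learner in the Laplacian eigenbasis; the regularizer
# penalizes high-frequency coordinates, encoding smoothness on the graph.
import numpy as np

def spectral_ucb_step(V, A, b, beta=1.0):
    """V: rows are arms' features in the Laplacian eigenbasis;
    (A, b): running regularized least-squares statistics."""
    theta = np.linalg.solve(A, b)                  # estimated weights
    Ainv = np.linalg.inv(A)
    mean = V @ theta
    width = beta * np.sqrt(np.einsum("ij,jk,ik->i", V, Ainv, V))
    return int(np.argmax(mean + width))            # UCB choice of node

n = 20
L = (np.diag(np.r_[1, 2 * np.ones(n - 2), 1])
     - np.eye(n, k=1) - np.eye(n, k=-1))           # path-graph Laplacian
lam, U = np.linalg.eigh(L)
A = np.diag(lam + 0.1)                             # eigenvalue-weighted reg.
b = np.zeros(n)
arm = spectral_ucb_step(U, A, b)
# after observing reward r for features x = U[arm]:
#   A += np.outer(x, x); b += r * x
```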
Online Learning under Delayed Feedback
"... Online learning with delayed feedback has received increasing attention recently due to its several applications in distributed, webbased learning problems. In this paper we provide a systematic study of the topic, and analyze the effect of delay on the regret of online learning algorithms. Somewha ..."
Abstract

Cited by 5 (0 self)
 Add to MetaCart
(Show Context)
Online learning with delayed feedback has received increasing attention recently due to its applications in distributed, web-based learning problems. In this paper we provide a systematic study of the topic, and analyze the effect of delay on the regret of online learning algorithms. Somewhat surprisingly, it turns out that delay increases the regret multiplicatively in adversarial problems, and additively in stochastic problems. We give meta-algorithms that transform, in a black-box fashion, algorithms developed for the non-delayed case into ones that can handle delays in the feedback loop. Modifications of the well-known UCB algorithm are also developed for the bandit problem with delayed feedback, with the advantage over the meta-algorithms that they can be implemented with lower complexity.
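The black-box reduction is easy to sketch: keep a pool of instances of a non-delayed base learner, let each round be handled by an instance that is not currently waiting for feedback, and route the delayed feedback back to the instance that made the prediction. The `ExpWeights` toy base learner and all names below are placeholders showing the shape of the construction, not the paper's exact algorithm.

```python
# Skeleton of a black-box reduction from delayed to non-delayed learning.
import random

class ExpWeights:
    """Toy non-delayed base learner over two actions (placeholder)."""
    def __init__(self):
        self.w = [1.0, 1.0]
        self.last = 0
    def act(self):
        self.last = 0 if random.random() < self.w[0] / sum(self.w) else 1
        return self.last
    def update(self, reward):
        self.w[self.last] *= (1.0 + 0.1 * reward)   # multiplicative update

class DelayedWrapper:
    """Pool of base instances; each round handled by an instance not
    waiting on feedback; delayed feedback routed back to its instance."""
    def __init__(self, make_base):
        self.make_base = make_base
        self.free, self.busy = [], {}
    def act(self, round_id):
        inst = self.free.pop() if self.free else self.make_base()
        self.busy[round_id] = inst
        return inst.act()
    def feedback(self, round_id, reward):
        inst = self.busy.pop(round_id)
        inst.update(reward)          # instance learns as if undelayed
        self.free.append(inst)

wrapper = DelayedWrapper(ExpWeights)
a0 = wrapper.act(0); a1 = wrapper.act(1)   # round 1 starts before round 0 ends
wrapper.feedback(0, reward=1.0)            # late feedback reaches instance 0
```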
Parallel Gaussian process optimization with upper confidence bound and pure exploration
In Machine Learning and Knowledge Discovery in Databases, 2013
"... Abstract. In this paper, we consider the challenge of maximizing an unknown function f for which evaluations are noisy and are acquired with high cost. An iterative procedure uses the previous measures to actively select the next estimation of f which is predicted to be the most useful. We focus on ..."
Abstract

Cited by 5 (2 self)
 Add to MetaCart
(Show Context)
In this paper, we consider the challenge of maximizing an unknown function f for which evaluations are noisy and are acquired at high cost. An iterative procedure uses the previous measurements to actively select the next evaluation of f that is predicted to be the most useful. We focus on the case where the function can be evaluated in parallel with batches of fixed size, and analyze the benefit compared to the purely sequential procedure in terms of cumulative regret. We introduce the Gaussian Process Upper Confidence Bound and Pure Exploration algorithm (GP-UCB-PE), which combines the UCB strategy and pure exploration in the same batch of evaluations along the parallel iterations. We prove theoretical upper bounds on the regret with batches of size K for this procedure, which show an improvement of the order of √K for fixed iteration cost over purely sequential versions. Moreover, the multiplicative constants involved have the property of being dimension-free. We also empirically confirm the efficiency of GP-UCB-PE on real and synthetic problems compared to state-of-the-art competitors.
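The batch rule itself is short: the first point of each batch maximizes the UCB, and the remaining K−1 pure-exploration points maximize the GP posterior variance, which can be updated before any of the batch is evaluated because the variance depends only on input locations, not observed values. The kernel, β, and candidate grid in this sketch are assumptions.

```python
# Sketch of the GP-UCB-PE batch rule: one UCB point, then K-1 points of
# maximum posterior variance (computable without the pending observations).
import numpy as np

def rbf(A, B, ls=0.2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls ** 2))

def posterior(Xobs, yobs, Xcand, noise=1e-3):
    K = rbf(Xobs, Xobs) + noise * np.eye(len(Xobs))
    Ks = rbf(Xobs, Xcand)
    sol = np.linalg.solve(K, Ks)
    mu = sol.T @ yobs
    var = 1.0 - np.einsum("ij,ij->j", Ks, sol)     # k(x,x) = 1 for RBF
    return mu, var

def select_batch(Xobs, yobs, Xcand, K_batch=4, beta=2.0):
    mu, var = posterior(Xobs, yobs, Xcand)
    batch = [int(np.argmax(mu + beta * np.sqrt(var)))]      # UCB point
    for _ in range(K_batch - 1):
        Xaug = np.vstack([Xobs, Xcand[batch]])
        yaug = np.concatenate([yobs, np.zeros(len(batch))])  # values unused
        _, var = posterior(Xaug, yaug, Xcand)
        batch.append(int(np.argmax(var)))                    # pure exploration
    return batch

Xobs = np.random.rand(5, 1); yobs = np.sin(6 * Xobs[:, 0])
Xcand = np.linspace(0, 1, 100)[:, None]
print(select_batch(Xobs, yobs, Xcand))
```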
MULTI-ARMED BANDIT PROBLEMS UNDER DELAYED FEEDBACK, 2012
"... In this thesis, the multiarmed bandit (MAB) problem in online learning is studied, when the feedback information is not observed immediately but rather after arbitrary, unknown, random delays. In the “stochastic” setting when the rewards come from a fixed distribution, an algorithm is given that ..."
Abstract

Cited by 4 (0 self)
 Add to MetaCart
In this thesis, the multi-armed bandit (MAB) problem in online learning is studied in the setting where the feedback information is not observed immediately, but rather after arbitrary, unknown, random delays. In the “stochastic” setting, when the rewards come from a fixed distribution, an algorithm is given that uses a non-delayed MAB algorithm as a black box. We also give a method to generalize the theoretical guarantees of non-delayed UCB-type algorithms to the delayed stochastic setting. Assuming the delays are independent of the rewards, we upper bound the penalty in the performance of these algorithms (measured by “regret”) by an additive term depending on the delays. When the rewards are chosen in an adversarial manner, we give a black-box style algorithm using multiple instances of a non-delayed algorithm.
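A sketch of the UCB-type adaptation: compute indices only from rewards that have already arrived, so pending pulls are simply not counted yet; the additive regret penalty comes from these uncounted pulls. The delay and reward simulations below are toys, not the thesis's exact algorithm or constants.

```python
# UCB1 variant under delayed feedback: a priority queue delivers rewards
# late, and indices use only the observations that have arrived so far.
import heapq, math, random

def delayed_ucb(means, horizon=5000, max_delay=50):
    n = [0] * len(means)          # arrivals counted per arm
    s = [0.0] * len(means)        # summed rewards that have arrived
    pending = []                  # heap of (arrival_time, arm, reward)
    for t in range(1, horizon + 1):
        while pending and pending[0][0] <= t:      # deliver due feedback
            _, a, r = heapq.heappop(pending)
            n[a] += 1; s[a] += r
        if any(k == 0 for k in n):
            arm = n.index(0)                       # initialize each arm
        else:
            arm = max(range(len(means)),
                      key=lambda a: s[a] / n[a]
                      + math.sqrt(2 * math.log(t) / n[a]))
        reward = float(random.random() < means[arm])   # Bernoulli pull
        heapq.heappush(pending,
                       (t + random.randint(1, max_delay), arm, reward))
    return s, n

print(delayed_ucb([0.4, 0.5, 0.6]))
```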
Off-policy reinforcement learning with Gaussian processes. Acta Automatica Sinica, 2014
"... An offpolicy Bayesian nonparameteric approximate reinforcement learning framework, termed as GPQ, that employs a Gaussian Processes (GP) model of the value (Q) function is presented in both the batch and online settings. Sufficient conditions on GP hyperparameter selection are established to guara ..."
Abstract

Cited by 2 (2 self)
 Add to MetaCart
(Show Context)
An off-policy Bayesian nonparametric approximate reinforcement learning framework, termed GPQ, that employs a Gaussian process (GP) model of the value (Q) function is presented for both the batch and online settings. Sufficient conditions on GP hyperparameter selection are established to guarantee convergence of off-policy GPQ in the batch setting, and theoretical and practical extensions are provided for the online case. Empirical results demonstrate that GPQ has competitive learning speed in addition to its convergence guarantees and its ability to automatically choose its own basis locations.
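A sketch of the batch setting in fitted-Q-iteration style: a GP regresses Q(s, a), and targets are bootstrapped with a max over next actions. The default kernel, iteration count, and toy domain are assumptions of this sketch; the paper's contribution is precisely the hyperparameter conditions under which such an off-policy iteration converges.

```python
# Fitted-Q-iteration-style sketch of batch GPQ: a GP models Q(s, a) and
# is refit against Bellman targets r + gamma * max_a' Q(s', a').
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def gpq_batch(S, A, R, Snext, actions, gamma=0.9, iters=20):
    X = np.column_stack([S, A])                    # (s, a) inputs
    q = GaussianProcessRegressor().fit(X, R)       # initialize with rewards
    for _ in range(iters):
        nxt = np.stack([q.predict(np.column_stack(
                            [Snext, np.full(len(Snext), a)]))
                        for a in actions])
        y = R + gamma * nxt.max(axis=0)            # Bellman targets
        q = GaussianProcessRegressor().fit(X, y)
    return q

# Toy one-dimensional batch: reward is highest near s = 1.
rng = np.random.default_rng(0)
S = rng.uniform(0, 1, 40); A = rng.choice([-1.0, 1.0], 40)
Snext = np.clip(S + 0.1 * A, 0, 1); R = -np.abs(Snext - 1.0)
q = gpq_batch(S, A, R, Snext, actions=[-1.0, 1.0])
```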
Computationally efficient Gaussian process changepoint detection and regression, 2014
"... Most existing GP regression algorithms assume a single generative model, leading to poor performance when data are nonstationary, i.e. generated from multiple switching processes. Existing methods for GP regression over nonstationary data include clustering and changepoint detection algorithms. Ho ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
(Show Context)
Most existing GP regression algorithms assume a single generative model, leading to poor performance when data are nonstationary, i.e., generated from multiple switching processes. Existing methods for GP regression over nonstationary data include clustering and changepoint detection algorithms. However, these methods require significant computation, do not come with provable guarantees on correctness and speed, and most algorithms only work in batch settings. This thesis presents an efficient online GP framework, GP-NBC, that leverages the generalized likelihood ratio test to detect changepoints and learn multiple Gaussian process models from streaming data. Furthermore, GP-NBC can quickly recognize and reuse previously seen models. The algorithm is shown to be theoretically sample efficient in terms of limiting mistaken predictions. Our empirical results on two real-world datasets and one synthetic dataset show that GP-NBC outperforms state-of-the-art methods for nonstationary regression in terms of regression error and computational efficiency. The second part of the thesis introduces a Reinforcement Learning (RL) algorithm, UCRL-GP-CPD, for multi-task reinforcement learning when the reward function is nonstationary.
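The generalized likelihood ratio test at the core of this approach can be sketched in a few lines: compare the likelihood of a recent window of prediction residuals under the current model against a model refit to that window, and declare a changepoint when the ratio exceeds a threshold. Plain Gaussians stand in for full GP models here, and the window size and threshold are illustrative.

```python
# Drastically simplified GLR changepoint test on streaming residuals.
import numpy as np

def gaussian_loglik(x, mu, var):
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def glr_changepoint(residuals, window=20, threshold=10.0):
    """residuals: prediction errors of the current model on the stream."""
    if len(residuals) < window:
        return False
    w = np.asarray(residuals[-window:])
    ll_current = gaussian_loglik(w, 0.0, 1.0)                # current model
    ll_refit = gaussian_loglik(w, w.mean(), w.var() + 1e-6)  # refit to window
    return ll_refit - ll_current > threshold                 # GLR statistic

# Stream with a mean shift at t = 100; residuals stand in for GP errors.
stream = list(np.random.randn(100)) + list(3 + np.random.randn(30))
errs, flags = [], []
for x in stream:
    errs.append(x)
    flags.append(glr_changepoint(errs))
print(flags.index(True))      # first detected changepoint
```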
Gaussian process optimization with mutual information. arXiv preprint arXiv:1311.4825, 2013
"... In this paper, we analyze a generic algorithm scheme for sequential global optimization using Gaussian processes. The upper bounds we derive on the cumulative regret for this generic algorithm improve by an exponential factor the previously known bounds for algorithms like GPUCB. We also introduce ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
In this paper, we analyze a generic algorithm scheme for sequential global optimization using Gaussian processes. The upper bounds we derive on the cumulative regret for this generic algorithm improve by an exponential factor the previously known bounds for algorithms like GP-UCB. We also introduce the novel Gaussian Process Mutual Information algorithm (GP-MI), which significantly improves these upper bounds further. We confirm the efficiency of this algorithm on synthetic and real tasks against its natural competitor, GP-UCB, as well as the Expected Improvement heuristic.
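As I read the paper, GP-MI replaces GP-UCB's exploration bonus with one that shrinks as information accumulates: the query maximizes μ(x) + √α(√(σ²(x) + γ) − √γ), where γ is the running sum of posterior variances at the queried points. Treat that exact form and the constant α as assumptions of this sketch.

```python
# Sketch of the GP-MI acquisition rule (form of the bonus assumed above).
import numpy as np

def gp_mi_select(mu, var, gamma, alpha=1.0):
    """mu, var: GP posterior over candidates; gamma: accumulated variance."""
    score = mu + np.sqrt(alpha) * (np.sqrt(var + gamma) - np.sqrt(gamma))
    i = int(np.argmax(score))
    return i, gamma + var[i]      # query index and updated gamma

mu = np.random.randn(100); var = np.random.rand(100)   # toy posterior
i, gamma = gp_mi_select(mu, var, gamma=0.0)
```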
Taking the Human Out of the Loop: A Review of Bayesian Optimization
"... Big data applications are typically associated with systems involving large numbers of users, massive complex software systems, and largescale heterogeneous computing and storage architectures. The construction of such systems involves many distributed design choices. The end products (e.g., reco ..."
Abstract
 Add to MetaCart
Big data applications are typically associated with systems involving large numbers of users, massive complex software systems, and large-scale heterogeneous computing and storage architectures. The construction of such systems involves many distributed design choices. The end products (e.g., recommendation systems, medical analysis tools, real-time game engines, speech recognizers) thus involve many tunable configuration parameters. These parameters are often specified and hard-coded into the software by various developers or teams. If optimized jointly, these parameters can result in significant improvements. Bayesian optimization is a powerful tool for the joint optimization of design choices that has gained great popularity in recent years. It promises greater automation, so as to increase both product quality and human productivity. This review paper introduces Bayesian optimization, highlights some of its methodological aspects, and showcases a wide range of applications.
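A minimal Bayesian optimization loop of the kind the review surveys: fit a GP surrogate to past evaluations, maximize an acquisition function over candidates, evaluate, and repeat. Expected improvement, the toy objective, and the candidate grid are assumptions of this sketch, not anything specific to the review.

```python
# Generic Bayesian optimization loop with a GP surrogate and expected
# improvement as the acquisition function.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(mu, sigma, best):
    z = (mu - best) / np.maximum(sigma, 1e-9)
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

def bayes_opt(f, candidates, n_init=3, n_iter=15):
    rng = np.random.default_rng(0)
    X = list(candidates[rng.choice(len(candidates), n_init, replace=False)])
    y = [f(x) for x in X]
    for _ in range(n_iter):
        gp = GaussianProcessRegressor().fit(np.array(X), np.array(y))
        mu, sigma = gp.predict(candidates, return_std=True)
        x = candidates[int(np.argmax(expected_improvement(mu, sigma, max(y))))]
        X.append(x); y.append(f(x))                # evaluate and repeat
    return X[int(np.argmax(y))], max(y)

grid = np.linspace(0, 1, 200)[:, None]
print(bayes_opt(lambda x: float(np.sin(6 * x[0])), grid))
```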