Results 1 - 10
of
20
Auto-WEKA: Combined Selection and Hyperparameter Optimization of Classification Algorithms
"... Many different machine learning algorithms exist; taking into account each algorithm’s hyperparameters, there is a staggeringly large number of possible alternatives overall. We consider the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previo ..."
Abstract
-
Cited by 32 (8 self)
- Add to MetaCart
(Show Context)
Many different machine learning algorithms exist; taking into account each algorithm’s hyperparameters, there is a staggeringly large number of possible alternatives overall. We consider the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previous work that attacks these issues separately. We show that this problem can be addressed by a fully automated approach, leveraging recent innovations in Bayesian optimization. Specifically, we consider a wide range of feature selection techniques (combining 3 search and 8 evaluator methods) and all classification approaches implemented in WEKA’s standard distribution, spanning 2 ensemble methods, 10 meta-methods, 27 base classifiers, and hyperparameter settings for each classifier. On each of 21 popular datasets from the UCI repository, the KDD Cup 09, variants of the MNIST dataset and CIFAR-10, we show classification performance often much better than using standard selection and hyperparameter optimization methods. We hope that our approach will help non-expert users to more effectively identify machine learning algorithms and hyperparameter settings appropriate to their applications, and hence to achieve improved performance.
Near-optimal Batch Mode Active Learning and Adaptive Submodular Optimization
"... Active learning can lead to a dramatic reduction in labeling effort. However, in many practical implementations (such as crowdsourcing, surveys, high-throughput experimental design), it is preferable to query labels for batches of examples to be labelled in parallel. While several heuristics have be ..."
Abstract
-
Cited by 12 (1 self)
- Add to MetaCart
Active learning can lead to a dramatic reduction in labeling effort. However, in many practical implementations (such as crowdsourcing, surveys, high-throughput experimental design), it is preferable to query labels for batches of examples to be labelled in parallel. While several heuristics have been proposed for batch-mode active learning, little is known about their theoretical performance. We consider batch mode active learning and more general information-parallel stochastic optimization problems that exhibit adaptive submodularity, a natural diminishing returns condition. We prove that for such problems, a simple greedy strategy is competitive with the optimal batch-mode policy. In some cases, surprisingly, the use of batches incurs competitively low cost, even when compared to a fully sequential strategy. We demonstrate the effectiveness of our approach on batch-mode active learning tasks, where it outperforms the state of the art, as well as the novel problem of multi-stage influence maximization in social networks. 1.
Spectral bandits for smooth graph functions
- in Proc. Intern. Conf. Mach. Learning (ICML
, 2014
"... Abstract Smooth functions on graphs have wide applications in manifold and semi-supervised learning. In this paper, we study a bandit problem where the payoffs of arms are smooth on a graph. This framework is suitable for solving online learning problems that involve graphs, such as contentbased re ..."
Abstract
-
Cited by 8 (5 self)
- Add to MetaCart
(Show Context)
Abstract Smooth functions on graphs have wide applications in manifold and semi-supervised learning. In this paper, we study a bandit problem where the payoffs of arms are smooth on a graph. This framework is suitable for solving online learning problems that involve graphs, such as contentbased recommendation. In this problem, each item we can recommend is a node and its expected rating is similar to its neighbors. The goal is to recommend items that have high expected ratings. We aim for the algorithms where the cumulative regret with respect to the optimal policy would not scale poorly with the number of nodes. In particular, we introduce the notion of an effective dimension, which is small in real-world graphs, and propose two algorithms for solving our problem that scale linearly and sublinearly in this dimension. Our experiments on real-world content recommendation problem show that a good estimator of user preferences for thousands of items can be learned from just tens of nodes evaluations.
Online Learning under Delayed Feedback
"... Online learning with delayed feedback has received increasing attention recently due to its several applications in distributed, web-based learning problems. In this paper we provide a systematic study of the topic, and analyze the effect of delay on the regret of online learning algorithms. Somewha ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
(Show Context)
Online learning with delayed feedback has received increasing attention recently due to its several applications in distributed, web-based learning problems. In this paper we provide a systematic study of the topic, and analyze the effect of delay on the regret of online learning algorithms. Somewhat surprisingly, it turns out that delay increases the regret in a multiplicative way in adversarial problems, and in an additive way in stochastic problems. We give meta-algorithms that transform, in a black-box fashion, algorithms developed for the non-delayed case into ones that can handle the presence of delays in the feedback loop. Modifications of the well-known UCB algorithm are also developed for the bandit problem with delayed feedback, with the advantage over the meta-algorithms that they can be implemented with lower complexity. 1.
Parallel Gaussian process optimization with upper confidence bound and pure exploration
- In Machine Learning and Knowledge Discovery in Databases
, 2013
"... Abstract. In this paper, we consider the challenge of maximizing an unknown function f for which evaluations are noisy and are acquired with high cost. An iterative procedure uses the previous measures to ac-tively select the next estimation of f which is predicted to be the most useful. We focus on ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
(Show Context)
Abstract. In this paper, we consider the challenge of maximizing an unknown function f for which evaluations are noisy and are acquired with high cost. An iterative procedure uses the previous measures to ac-tively select the next estimation of f which is predicted to be the most useful. We focus on the case where the function can be evaluated in par-allel with batches of fixed size and analyze the benefit compared to the purely sequential procedure in terms of cumulative regret. We introduce the Gaussian Process Upper Confidence Bound and Pure Exploration algorithm (GP-UCB-PE) which combines the UCB strategy and Pure Ex-ploration in the same batch of evaluations along the parallel iterations. We prove theoretical upper bounds on the regret with batches of size K for this procedure which show the improvement of the order of K for fixed iteration cost over purely sequential versions. Moreover, the mul-tiplicative constants involved have the property of being dimension-free. We also confirm empirically the efficiency of GP-UCB-PE on real and synthetic problems compared to state-of-the-art competitors. 1
MULTI-ARMED BANDIT PROBLEMS UNDER DELAYED FEEDBACK
, 2012
"... In this thesis, the multi-armed bandit (MAB) problem in online learning is studied, when the feed-back information is not observed immediately but rather after arbitrary, unknown, random delays. In the “stochastic” setting when the rewards come from a fixed distribution, an algorithm is given that ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
In this thesis, the multi-armed bandit (MAB) problem in online learning is studied, when the feed-back information is not observed immediately but rather after arbitrary, unknown, random delays. In the “stochastic” setting when the rewards come from a fixed distribution, an algorithm is given that uses a non-delayed MAB algorithm as a black-box. We also give a method to generalize the theoretical guarantees of non-delayed UCB-type algorithms to the delayed stochastic setting. Assuming the delays are independent of the rewards, we upper bound the penalty in the performance of these algorithms (measured by “regret”) by an additive term depending on the delays. When the rewards are chosen in an adversarial manner, we give a black-box style algorithm using multiple instances
Off-policy reinforcement learning with gaussian processes. Acta Automatica Sinica
, 2014
"... An off-policy Bayesian nonparameteric approximate reinforcement learning framework, termed as GPQ, that employs a Gaussian Processes (GP) model of the value (Q) function is presented in both the batch and online settings. Suf-ficient conditions on GP hyperparameter selection are established to guara ..."
Abstract
-
Cited by 2 (2 self)
- Add to MetaCart
(Show Context)
An off-policy Bayesian nonparameteric approximate reinforcement learning framework, termed as GPQ, that employs a Gaussian Processes (GP) model of the value (Q) function is presented in both the batch and online settings. Suf-ficient conditions on GP hyperparameter selection are established to guarantee convergence of off-policy GPQ in the batch setting, and theoretical and practical extensions are provided for the online case. Empirical results demonstrate GPQ has competitive learning speeds in addition to its convergence guarantees and its ability to automatically choose its own bases locations. 1
Computationally efficient Gaussian process changepoint detection and regression
, 2014
"... Most existing GP regression algorithms assume a single generative model, leading to poor performance when data are nonstationary, i.e. generated from multiple switching processes. Existing methods for GP regression over non-stationary data include clustering and change-point detection algorithms. Ho ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
Most existing GP regression algorithms assume a single generative model, leading to poor performance when data are nonstationary, i.e. generated from multiple switching processes. Existing methods for GP regression over non-stationary data include clustering and change-point detection algorithms. However, these methods require significant computation, do not come with provable guarantees on correctness and speed, and most algorithms only work in batch settings. This thesis presents an efficient online GP framework, GP-NBC, that leverages the generalized likelihood ratio test to detect changepoints and learn multiple Gaussian Process models from streaming data. Furthermore, GP-NBC can quickly recog-nize and reuse previously seen models. The algorithm is shown to be theoretically sample efficient in terms of limiting mistaken predictions. Our empirical results on two real-world datasets and one synthetic dataset show GP-NBC outperforms state of the art methods for nonstationary regression in terms of regression error and computational efficiency. The second part of the thesis introduces a Reinforcement Learning (RL) algorithm, UCRL-GP-CPD, for multi-task Reinforcement Learning when the reward function is non-
Gaussian process optimization with mutual information. arXiv preprint arXiv:1311.4825
, 2013
"... In this paper, we analyze a generic algorithm scheme for sequential global opti-mization using Gaussian processes. The upper bounds we derive on the cumulative regret for this generic algorithm improve by an exponential factor the previously known bounds for algorithms like GP-UCB. We also introduce ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
In this paper, we analyze a generic algorithm scheme for sequential global opti-mization using Gaussian processes. The upper bounds we derive on the cumulative regret for this generic algorithm improve by an exponential factor the previously known bounds for algorithms like GP-UCB. We also introduce the novel Gaussian Process Mutual Information algorithm (GP-MI), which significantly improves fur-ther these upper bounds for the cumulative regret. We confirm the efficiency of this algorithm on synthetic and real tasks against the natural competitor, GP-UCB, and also the Expected Improvement heuristic. 1
Taking the Human Out of the Loop: A Review of Bayesian Optimization
"... Big data applications are typically associated with systems involving large numbers of users, massive complex software systems, and large-scale heterogeneous computing and storage architectures. The construction of such systems involves many distributed design choices. The end products (e.g., rec-o ..."
Abstract
- Add to MetaCart
Big data applications are typically associated with systems involving large numbers of users, massive complex software systems, and large-scale heterogeneous computing and storage architectures. The construction of such systems involves many distributed design choices. The end products (e.g., rec-ommendation systems, medical analysis tools, real-time game engines, speech recognizers) thus involves many tunable config-uration parameters. These parameters are often specified and hard-coded into the software by various developers or teams. If optimized jointly, these parameters can result in significant improvements. Bayesian optimization is a powerful tool for the joint optimization of design choices that is gaining great popularity in recent years. It promises greater automation so as to increase both product quality and human productivity. This review paper introduces Bayesian optimization, highlights some of its methodological aspects, and showcases a wide range of applications.