Results 1 - 10
of
10
Adaptive Submodularity: A New Approach to Active Learning and Stochastic Optimization
"... Solving stochastic optimization problems under partial observability, where one needs to adaptively make decisions with uncertain outcomes, is a fundamental but notoriously difficult challenge. In this paper, we introduce the concept of adaptive submodularity, generalizing submodular set functions t ..."
Abstract
-
Cited by 7 (1 self)
- Add to MetaCart
Solving stochastic optimization problems under partial observability, where one needs to adaptively make decisions with uncertain outcomes, is a fundamental but notoriously difficult challenge. In this paper, we introduce the concept of adaptive submodularity, generalizing submodular set functions to adaptive policies. We prove that if a problem satisfies this property, a simple adaptive greedy algorithm is guaranteed to be competitive with the optimal policy. We illustrate the usefulness of the concept by giving several examples of adaptive submodular objectives arising in diverse applications including sensor placement, viral marketing and pool-based active learning. Proving adaptive submodularity for these problems allows us to recover existing results in these applications as special cases and leads to natural generalizations. 1
Lower Bounds for Passive and Active Learning
"... We develop unified information-theoretic machinery for deriving lower bounds for passive and active learning schemes. Our bounds involve the so-called Alexander’s capacity function. The supremum of this function has been recently rediscovered by Hanneke in the context of active learning under the na ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
We develop unified information-theoretic machinery for deriving lower bounds for passive and active learning schemes. Our bounds involve the so-called Alexander’s capacity function. The supremum of this function has been recently rediscovered by Hanneke in the context of active learning under the name of “disagreement coefficient. ” For passive learning, our lower bounds match the upper bounds of Giné and Koltchinskii up to constants and generalize analogous results of Massart and Nédélec. For active learning, we provide first known lower bounds based on the capacity function rather than the disagreement coefficient. 1
Active Boosted Learning (ActBoost)
"... Active learning deals with the problem of selecting a small subset of examples to label, from a pool of unlabeled data, for training a good classifier. We develop an active learning algorithm in the boosting framework. In contrast to much of the recent efforts, which has focused on selecting the mos ..."
Abstract
- Add to MetaCart
Active learning deals with the problem of selecting a small subset of examples to label, from a pool of unlabeled data, for training a good classifier. We develop an active learning algorithm in the boosting framework. In contrast to much of the recent efforts, which has focused on selecting the most ambiguous unlabeled example to label based on the current learned classifier, our algorithm selects examples to maximally reduce the volume of the version space of feasible boosted classifiers. We show that under suitable sparsity assumptions, this strategy achieves the generalization error performance of a boosted classifier trained on the entire data set while only selecting logarithmically many unlabeled samples to label. We also establish a partial negative result, in that with out imposing structural assumptions it is difficult to guarantee generalization error performance. We explicitly characterize our convergence rate in terms of the sign pattern differences produced by the weak learners on the unlabeled data. We also present a convex relaxation to account for the non-convex sparse structure and show that the computational complexity of the resulting algorithm scales polynomially in the number of weak learners. We test ActBoost on several datasets to illustrate its performance and demonstrate its robustness to initialization. 1
Active Diagnosis under Persistent Noise with Unknown Noise Distribution: A Rank-Based Approach
"... We consider a problem of active diagnosis, where the goal is to efficiently identify an unknown object by sequentially selecting, and observing, the responses to binary valued queries. We assume that query observations are noisy, and further that the noise is persistent, meaning that repeating a que ..."
Abstract
- Add to MetaCart
We consider a problem of active diagnosis, where the goal is to efficiently identify an unknown object by sequentially selecting, and observing, the responses to binary valued queries. We assume that query observations are noisy, and further that the noise is persistent, meaning that repeating a query does not change the response. Previous work in this area either assumed the knowledge of the query noise distribution, or that the noise level is sufficiently low so that the unknown object can be identified with high accuracy. We make no such assumptions, and introduce an algorithm that returns a ranked list of objects, such that the expected rank of the true object is optimized. Furthermore, our algorithm does not require knowledge of the query noise distribution. 1
Near-Optimal Bayesian . . .
"... We tackle the fundamental problem of Bayesian active learning with noise, where we need to adaptively select from a number of expensive tests in order to identify an unknown hypothesis sampled from a known prior distribution. In the case of noise–free observations, a greedy algorithm called generali ..."
Abstract
- Add to MetaCart
We tackle the fundamental problem of Bayesian active learning with noise, where we need to adaptively select from a number of expensive tests in order to identify an unknown hypothesis sampled from a known prior distribution. In the case of noise–free observations, a greedy algorithm called generalized binary search (GBS) is known to perform near–optimally. We show that if the observations are noisy, perhaps surprisingly, GBS can perform very poorly. We develop EC 2, a novel, greedy active learning algorithm and prove that it is competitive with the optimal policy, thus obtaining the first competitiveness guarantees for Bayesian active learning with noisy observations. Our bounds rely on a recently discovered diminishing returns property called adaptive submodularity, generalizing the classical notion of submodular set functions to adaptive policies. Our results hold even if the tests have non–uniform cost and their noise is correlated. We also propose EFFECX-TIVE, a particularly fast approximation of EC 2, and evaluate it on a Bayesian experimental design problem involving human subjects, intended to tease apart competing economic theories of how people make decisions under uncertainty.
Extensions of Generalized Binary Search to Group . . .
"... Generalized Binary Search (GBS) is a well known greedy algorithm for identifying an unknown object while minimizing the number of “yes ” or “no ” questions posed about that object, and arises in problems such as active learning and active diagnosis. Here, we provide a coding-theoretic interpretation ..."
Abstract
- Add to MetaCart
Generalized Binary Search (GBS) is a well known greedy algorithm for identifying an unknown object while minimizing the number of “yes ” or “no ” questions posed about that object, and arises in problems such as active learning and active diagnosis. Here, we provide a coding-theoretic interpretation for GBS and show that GBS can be viewed as a top-down algorithm that greedily minimizes the expected number of queries required to identify an object. This interpretation is then used to extend GBS in two ways. First, we consider the case where the objects are partitioned into groups, and the objective is to identify only the group to which the object belongs. Then, we consider the case where the cost of identifying an object grows exponentially in the number of queries. In each case, we present an exact formula for the objective function involving Shannon or Rényi entropy, and develop a greedy algorithm for minimizing it.
Bandits, Query Learning, and the Haystack Dimension
"... Motivated by multi-armed bandits (MAB) problems with a very large or even infinite number of arms, we consider the problem of finding a maximum of an unknown target function by querying the function at chosen inputs (or arms). We give an analysis of the query complexity of this problem, under the as ..."
Abstract
- Add to MetaCart
Motivated by multi-armed bandits (MAB) problems with a very large or even infinite number of arms, we consider the problem of finding a maximum of an unknown target function by querying the function at chosen inputs (or arms). We give an analysis of the query complexity of this problem, under the assumption that the payoff of each arm is given by a function belonging to a known, finite, but otherwise arbitrary function class. Our analysis centers on a new notion of function class complexity that we call the haystack dimension, which is used to prove the approximate optimality of a simple greedy algorithm. This algorithm is then used as a subroutine in a functional MAB algorithm, yielding provably near-optimal regret. We provide a generalization to the infinite cardinality setting, and comment on how our analysis is connected to, and improves upon, existing results for query learning and generalized binary search. 1
OASIS: Online Active SemI-Supervised Learning
"... We consider a learning setting of importance to large scale machine learning: potentially unlimited data arrives sequentially, but only a small fraction of it is labeled. The learner cannot store the data; it should learn from both labeled and unlabeled data, and it may also request labels for some ..."
Abstract
- Add to MetaCart
We consider a learning setting of importance to large scale machine learning: potentially unlimited data arrives sequentially, but only a small fraction of it is labeled. The learner cannot store the data; it should learn from both labeled and unlabeled data, and it may also request labels for some of the unlabeled items. This setting is frequently encountered in real-world applications and has the characteristics of online, semi-supervised, and active learning. Yet previous learning models fail to consider these characteristics jointly. We present OASIS, a Bayesian model for this learning setting. The main contributions of the model include the novel integration of a semi-supervised likelihood function, a sequential Monte Carlo scheme for efficient online Bayesian updating, and a posterior-reduction criterion for active learning. Encouraging results on both synthetic and real-world optical character recognition data demonstrate the synergy of these characteristics in OASIS.
24th Annual Conference on Learning Theory Bandits, Query Learning, and the Haystack Dimension
"... Motivated by multi-armed bandits (MAB) problems with a very large or even infinite number of arms, we consider the problem of finding a maximum of an unknown target function by querying the function at chosen inputs (or arms). We give an analysis of the query complexity of this problem, under the as ..."
Abstract
- Add to MetaCart
Motivated by multi-armed bandits (MAB) problems with a very large or even infinite number of arms, we consider the problem of finding a maximum of an unknown target function by querying the function at chosen inputs (or arms). We give an analysis of the query complexity of this problem, under the assumption that the payoff of each arm is given by a function belonging to a known, finite, but otherwise arbitrary function class. Our analysis centers on a new notion of function class complexity that we call the haystack dimension, which is used to prove the approximate optimality of a simple greedy algorithm. This algorithm is then used as a subroutine in a functional MAB algorithm, yielding provably near-optimal regret. We provide a generalization to the infinite cardinality setting, and comment on how our analysis is connected to, and improves upon, existing results for query learning and generalized binary search. Keywords: Multi-armed bandits, learning theory 1.
Efficient Touch Based Localization through Submodularity
"... Abstract — Many robotic systems deal with uncertainty by performing a sequence of information gathering actions. In this work, we focus on the problem of efficiently constructing such a sequence by drawing an explicit connection to submodularity. Ideally, we would like a method that finds the optima ..."
Abstract
- Add to MetaCart
Abstract — Many robotic systems deal with uncertainty by performing a sequence of information gathering actions. In this work, we focus on the problem of efficiently constructing such a sequence by drawing an explicit connection to submodularity. Ideally, we would like a method that finds the optimal sequence of actions, taking the minimum amount of time while providing sufficient information. Finding this sequence, however, is generally intractable. As a result, many well-established methods select actions greedily. Surprisingly, this often performs well even with only one step lookahead. Our work first explains this high performance – we note that a commonly used metric, reduction of Shannon entropy, is submodular under certain assumptions, rendering the greedy solution comparable to the optimal plan in the offline setting. Recently developed notions of adaptive submodularity enable guarantees for a greedy algorithm in the online setting. We develop new methods within this framework, enabling us to provide guarantees compared to the optimal online policy, as well as exploit additional computational speedups. We demonstrate the effectiveness of these methods in simulation and on a robot. I.

