Results 1-10 of 102
Active learning literature survey
, 2010
Cited by 293 (2 self)
The key idea behind active learning is that a machine learning algorithm can achieve greater accuracy with fewer labeled training instances if it is allowed to choose the data from which it learns. An active learner may ask queries in the form of unlabeled instances to be labeled by an oracle (e.g., a human annotator). Active learning is well-motivated in many modern machine learning problems, where unlabeled data may be abundant but labels are difficult, time-consuming, or expensive to obtain. This report provides a general introduction to active learning and a survey of the literature. This includes a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date. An analysis of the empirical and theoretical evidence for active learning, a summary of several problem setting variants, and a discussion
Agnostic active learning
 In ICML
, 2006
Cited by 183 (15 self)
We state and analyze the first active learning algorithm which works in the presence of arbitrary forms of noise. The algorithm, A² (for Agnostic Active), relies only upon the assumption that the samples are drawn i.i.d. from a fixed distribution. We show that A² achieves an exponential improvement (i.e., requires only O(ln(1/ε)) samples to find an ε-optimal classifier) over the usual sample complexity of supervised learning, for several settings considered before in the realizable case. These include learning threshold classifiers and learning homogeneous linear separators with respect to an input distribution which is uniform over the unit sphere.
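To make the logarithmic label count for threshold classifiers concrete, here is a minimal sketch of the noiseless (realizable) case only. It is plain binary search over a sorted pool, not the A² algorithm itself, which additionally handles arbitrary noise. The function name `active_learn_threshold` and the `oracle` callable are hypothetical names chosen for illustration.

```python
def active_learn_threshold(pool, oracle):
    """Locate a 1-D threshold by binary search over an unlabeled pool.

    Assumptions (not from the abstract): labels are noiseless and have the
    form oracle(x) == 1 exactly when x >= t for some unknown threshold t.

    Returns (estimate, queries), where estimate is the smallest pool point
    labeled 1 (or float('inf') if every point is labeled 0) and queries is
    the number of labels requested.
    """
    xs = sorted(pool)
    lo, hi = 0, len(xs)  # invariant: xs[:lo] are labeled 0, xs[hi:] are labeled 1
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if oracle(xs[mid]) == 1:
            hi = mid          # boundary is at mid or to its left
        else:
            lo = mid + 1      # boundary is strictly to the right of mid
    estimate = xs[lo] if lo < len(xs) else float("inf")
    return estimate, queries
```

Under these assumptions a pool of 1000 points needs at most 10 label requests, whereas labeling the whole pool would need 1000.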
Coarse sample complexity bounds for active learning
 In Neural Information Processing Systems
, 2005
A bound on the label complexity of agnostic active learning
 In Proc. of the 24th International Conference on Machine Learning
, 2007
Cited by 95 (11 self)
We study the label complexity of pool-based active learning in the agnostic PAC model. Specifically, we derive general bounds on the number of label requests made by the A² algorithm proposed by Balcan, Beygelzimer & Langford (Balcan et al., 2006). This represents the first nontrivial general-purpose upper bound on label complexity in the agnostic PAC model.
Minimax bounds for active learning
 In COLT
, 2007
Cited by 85 (10 self)
This paper aims to shed light on achievable limits in active learning. Using minimax analysis techniques, we study the achievable rates of classification error convergence for broad classes of distributions characterized by decision boundary regularity and noise conditions. The results clearly indicate the conditions under which one can expect significant gains through active learning. Furthermore, we show that the learning rates derived are tight for “boundary fragment” classes in d-dimensional feature spaces when the feature marginal density is bounded from above and below.
Analysis of perceptron-based active learning
 In COLT
, 2005
Cited by 81 (10 self)
We start by showing that in an active learning setting, the Perceptron algorithm needs Ω(1/ε²) labels to learn linear separators within generalization error ε. We then present a simple selective sampling algorithm for this problem, which combines a modification of the perceptron update with an adaptive filtering rule for deciding which points to query. For data distributed uniformly over the unit sphere, we show that our algorithm reaches generalization error ε after asking for just Õ(d log(1/ε)) labels. This exponential improvement over the usual sample complexity of supervised learning has previously been demonstrated only for the computationally more complex query-by-committee algorithm.
1 Introduction
In many machine learning applications, unlabeled data is abundant but labeling is expensive. This distinction is not captured in the standard PAC or online models of supervised learning, and has motivated the field of active learning, in which the labels of data points are initially hidden, and the learner must pay for each label it wishes revealed. If query points are chosen randomly, the number of labels needed to reach a target generalization error ε, at a target confidence level 1 − δ, is similar to the sample complexity of supervised learning. The hope is that there are alternative querying strategies which require significantly fewer
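The selective sampling scheme outlined in the abstract above (a modified perceptron update plus an adaptive filtering rule) might look roughly like the following sketch. The specifics are assumptions made for illustration and are not stated in this listing: the reflection update `w - 2(w·x)x`, the initial query width `1/sqrt(d)`, the halving schedule, and the `patience` parameter. All function names are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def selective_perceptron(stream, oracle, d, patience=8):
    """Stream-based selective sampling with a reflection-style perceptron update.

    stream   : iterable of unit-norm points (lists of d floats)
    oracle   : callable x -> +1 or -1, charged once per call
    patience : assumed number of consecutive correct queried predictions
               after which the query region is halved
    Returns (w, labels_used).
    """
    it = iter(stream)
    x0 = next(it)
    w = [oracle(x0) * c for c in x0]   # start from the first labeled point
    labels_used = 1
    width = 1.0 / math.sqrt(d)         # assumed initial query width
    streak = 0
    for x in it:
        margin = dot(w, x)
        if abs(margin) > width:
            continue                   # filtered out: no label requested
        labels_used += 1
        y = oracle(x)
        if y * margin <= 0:
            # Mistake: reflect w across the hyperplane orthogonal to x.
            # For unit-norm x this leaves the norm of w unchanged.
            w = [wi - 2.0 * margin * xi for wi, xi in zip(w, x)]
            streak = 0
        else:
            streak += 1
            if streak >= patience:
                width /= 2.0           # shrink the region in which we query
                streak = 0
    return w, labels_used
```

Points far from the current decision boundary are skipped without paying for a label, which is where any label savings would come from; the sketch makes no claim about matching the bound quoted in the abstract.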
The True Sample Complexity of Active Learning
Cited by 63 (16 self)
We describe and explore a new perspective on the sample complexity of active learning. In many situations where it was generally believed that active learning does not help, we find that active learning does help in the limit, often with exponential improvements in sample complexity. This contrasts with the traditional analysis of active learning problems such as nonhomogeneous linear separators or depth-limited decision trees, in which Ω(1/ε) lower bounds are common; we point out that such results must be interpreted carefully, and that finding an ε-good classifier can always be accomplished with a number of samples asymptotically smaller than any such bound. These new insights arise from a subtle variation on the traditional definition of sample complexity, not previously recognized in the active learning literature.
Adaptive submodularity: Theory and applications in active learning and stochastic optimization
 J. Artificial Intelligence Research
, 2011
Cited by 61 (14 self)
Many problems in artificial intelligence require adaptively making a sequence of decisions with uncertain outcomes under partial observability. Solving such stochastic optimization problems is a fundamental but notoriously difficult challenge. In this paper, we introduce the concept of adaptive submodularity, generalizing submodular set functions to adaptive policies. We prove that if a problem satisfies this property, a simple adaptive greedy algorithm is guaranteed to be competitive with the optimal policy. In addition to providing performance guarantees for both stochastic maximization and coverage, adaptive submodularity can be exploited to drastically speed up the greedy algorithm by using lazy evaluations. We illustrate the usefulness of the concept by giving several examples of adaptive submodular objectives arising in diverse AI applications including management of sensing resources, viral marketing and active learning. Proving adaptive submodularity for these problems allows us to recover existing results in these applications as special cases, improve approximation guarantees and handle natural generalizations.
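The lazy-evaluation speedup mentioned in the abstract above can be illustrated in the simpler, non-adaptive setting. The sketch below is the classic lazy greedy for a monotone submodular set function under a cardinality limit, not the paper's adaptive policy; the names `lazy_greedy`, `f`, and `k` are hypothetical. It relies on the fact that, for a submodular function, a marginal gain computed against an earlier selection is an upper bound on the gain against the current one.

```python
import heapq


def lazy_greedy(ground_set, f, k):
    """Pick up to k elements greedily, re-evaluating gains only when needed.

    ground_set : list of candidate elements
    f          : callable taking a list of elements and returning a number;
                 assumed monotone and submodular
    Returns (selected, value, evaluations), where evaluations counts the
    marginal-gain computations performed.
    """
    selected = []
    value = f(selected)
    # Heap entries: (-gain, tie_breaker, element, selection_size_when_computed)
    heap = [(-(f([e]) - value), i, e, 0) for i, e in enumerate(ground_set)]
    heapq.heapify(heap)
    evaluations = len(heap)
    while heap and len(selected) < k:
        neg_gain, i, e, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # The gain is up to date, and every other entry is an upper
            # bound that does not exceed it, so e is the greedy choice.
            if -neg_gain <= 0:
                break
            selected.append(e)
            value += -neg_gain
        else:
            # Stale bound: recompute against the current selection and requeue.
            gain = f(selected + [e]) - value
            evaluations += 1
            heapq.heappush(heap, (-gain, i, e, len(selected)))
    return selected, value, evaluations
```

With a set-cover objective, many stale entries never reach the top of the heap and are therefore never re-evaluated, which is the source of the savings over recomputing every gain in every round.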
Generalized binary search
 In Proceedings of the 46th Allerton Conference on Communications, Control, and Computing
, 2008
Cited by 57 (0 self)
This paper addresses the problem of noisy Generalized Binary Search (GBS). GBS is a well-known greedy algorithm for determining a binary-valued hypothesis through a sequence of strategically selected queries. At each step, a query is selected that most evenly splits the hypotheses under consideration into two disjoint subsets, a natural generalization of the idea underlying classic binary search. GBS is used in many applications, including fault testing, machine diagnostics, disease diagnosis, job scheduling, image processing, computer vision, and active learning. In most of these cases, the responses to queries can be noisy. Past work has provided a partial characterization of GBS, but existing noise-tolerant versions of GBS are suboptimal in terms of query complexity. This paper presents an optimal algorithm for noisy GBS and demonstrates its application to learning multidimensional threshold functions.
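The greedy splitting rule described in the abstract above can be sketched for the noiseless case as follows. This is not the paper's noise-tolerant algorithm; it assumes every response is correct and that the true hypothesis is in the candidate set. The names `generalized_binary_search`, `hypotheses`, and `oracle` are hypothetical.

```python
def generalized_binary_search(hypotheses, queries, oracle):
    """Noiseless GBS over a finite hypothesis class.

    hypotheses : list of callables mapping a query to +1 or -1
    queries    : list of available queries
    oracle     : callable returning the true +1/-1 response to a query
    Returns (survivors, asked): indices of hypotheses consistent with all
    responses, and the queries that were posed, in order.
    """
    survivors = list(range(len(hypotheses)))
    asked = []
    while len(survivors) > 1:
        # Choose the query whose predicted responses are most balanced,
        # i.e. the one minimizing |sum of h(x)| over surviving hypotheses.
        def imbalance(x):
            return abs(sum(hypotheses[i](x) for i in survivors))

        best = min(queries, key=imbalance)
        if imbalance(best) == len(survivors):
            break  # no query separates the remaining hypotheses
        response = oracle(best)
        asked.append(best)
        survivors = [i for i in survivors if hypotheses[i](best) == response]
    return survivors, asked
```

For one-dimensional thresholds this reduces to ordinary binary search: with 16 candidate thresholds, each query halves the surviving set and 4 queries suffice.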
Margin based active learning
 Proc. of the 20th Conference on Learning Theory
, 2007
Cited by 55 (9 self)
We present a framework for margin based active learning of linear separators. We instantiate it for a few important cases, some of which have been previously considered in the literature. We analyze the effectiveness of our framework both in the realizable case and in a specific noisy setting related to the Tsybakov small noise condition.