Results 1–10 of 69
Active learning literature survey
2010
Cited by 132 (1 self)
The key idea behind active learning is that a machine learning algorithm can achieve greater accuracy with fewer labeled training instances if it is allowed to choose the data from which it learns. An active learner may ask queries in the form of unlabeled instances to be labeled by an oracle (e.g., a human annotator). Active learning is well-motivated in many modern machine learning problems, where unlabeled data may be abundant but labels are difficult, time-consuming, or expensive to obtain. This report provides a general introduction to active learning and a survey of the literature. This includes a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date. An analysis of the empirical and theoretical evidence for active learning, a summary of several problem setting variants, and a discussion …
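The query loop this abstract describes is easy to sketch. The following is a minimal pool-based illustration, not taken from the survey: the oracle (a hidden threshold at 0.5), the one-dimensional threshold learner, and the distance-to-boundary uncertainty measure are all assumptions made for the demo.

```python
def oracle(x):
    # Simulated human annotator; the hidden target concept
    # (threshold at 0.5) is an assumption for this demo.
    return 1 if x >= 0.5 else 0

def fit_threshold(labeled):
    # Toy learner: place the boundary midway between the largest
    # known-negative point and the smallest known-positive point.
    neg = [x for x, y in labeled if y == 0]
    pos = [x for x, y in labeled if y == 1]
    return (max(neg, default=0.0) + min(pos, default=1.0)) / 2

def uncertainty_sampling(pool, budget):
    # Active loop: repeatedly ask the oracle to label the unlabeled
    # point the current model is least certain about (closest to the
    # current threshold), then refit.
    pool = sorted(pool)
    labeled = [(pool[0], oracle(pool[0])), (pool[-1], oracle(pool[-1]))]
    unlabeled = pool[1:-1]
    for _ in range(budget - 2):
        t = fit_threshold(labeled)
        x = min(unlabeled, key=lambda u: abs(u - t))  # most uncertain point
        unlabeled.remove(x)
        labeled.append((x, oracle(x)))
    return fit_threshold(labeled)

pool = [i / 200 for i in range(201)]
print(uncertainty_sampling(pool, budget=10))  # lands close to the true 0.5
```

With only 10 labels the queries home in on the boundary, which is the intuition behind the "fewer labeled instances" claim; labeling the same number of random points would locate the threshold far less precisely.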
Agnostic active learning
In ICML, 2006
Cited by 125 (13 self)
We state and analyze the first active learning algorithm which works in the presence of arbitrary forms of noise. The algorithm, A2 (for Agnostic Active), relies only upon the assumption that the samples are drawn i.i.d. from a fixed distribution. We show that A2 achieves an exponential improvement (i.e., requires only O(ln(1/ε)) samples to find an ε-optimal classifier) over the usual sample complexity of supervised learning, for several settings considered before in the realizable case. These include learning threshold classifiers and learning homogeneous linear separators with respect to an input distribution which is uniform over the unit sphere.
Coarse sample complexity bounds for active learning
In Neural Information Processing Systems, 2005
Analysis of perceptron-based active learning
In COLT, 2005
Cited by 69 (12 self)
Abstract. We start by showing that in an active learning setting, the Perceptron algorithm needs Ω(1/ε²) labels to learn linear separators within generalization error ε. We then present a simple selective sampling algorithm for this problem, which combines a modification of the perceptron update with an adaptive filtering rule for deciding which points to query. For data distributed uniformly over the unit sphere, we show that our algorithm reaches generalization error ε after asking for just Õ(d log(1/ε)) labels. This exponential improvement over the usual sample complexity of supervised learning has previously been demonstrated only for the computationally more complex query-by-committee algorithm.
1 Introduction. In many machine learning applications, unlabeled data is abundant but labeling is expensive. This distinction is not captured in the standard PAC or online models of supervised learning, and has motivated the field of active learning, in which the labels of data points are initially hidden, and the learner must pay for each label it wishes revealed. If query points are chosen randomly, the number of labels needed to reach a target generalization error ε, at a target confidence level 1 − δ, is similar to the sample complexity of supervised learning. The hope is that there are alternative querying strategies which require significantly fewer …
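The selective sampling scheme summarized above can be sketched roughly as follows. This is a hedged illustration, not the authors' exact algorithm: the hidden target direction `u`, the patience-based shrinking schedule, and all constants are assumptions for the demo. Only the overall shape follows the description: query labels only near the current hyperplane, apply a reflection-style perceptron update on mistakes, and adaptively tighten the query region.

```python
import math
import random

random.seed(0)
d = 5
u = [1.0] + [0.0] * (d - 1)  # hidden target direction (an assumption for the demo)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def random_unit(d):
    # Draw a point uniformly from the unit sphere in R^d.
    v = [random.gauss(0, 1) for _ in range(d)]
    n = math.sqrt(dot(v, v))
    return [vi / n for vi in v]

def selective_perceptron(n_stream, s=1.0, patience=20):
    """Query a label only when the point lies near the current hyperplane
    (|w.x| <= s); on a mistake apply the reflection update w <- w - 2(w.x)x;
    shrink s after a run of mistake-free queries."""
    w = random_unit(d)
    queries = run = 0
    for _ in range(n_stream):
        x = random_unit(d)
        margin = dot(w, x)
        if abs(margin) > s:               # filtering rule: confident, skip label
            continue
        queries += 1
        run += 1
        y = 1 if dot(u, x) >= 0 else -1   # oracle answers with the true label
        if y * margin < 0:                # mistake: reflect w across x's hyperplane
            w = [wi - 2 * margin * xi for wi, xi in zip(w, x)]
            run = 0
        elif run >= patience:
            s /= 2                        # adaptively tighten the query region
            run = 0
    return w, queries

w, queries = selective_perceptron(5000)
print(queries, dot(w, u))  # labels actually requested, alignment with the target
```

Since the reflection update preserves ‖w‖ = 1 on unit-norm inputs, `dot(w, u)` is the cosine of the angle to the target and should approach 1 while only a fraction of the stream's labels are ever requested.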
A bound on the label complexity of agnostic active learning
In Proc. of the 24th International Conference on Machine Learning, 2007
Cited by 63 (9 self)
We study the label complexity of pool-based active learning in the agnostic PAC model. Specifically, we derive general bounds on the number of label requests made by the A2 algorithm proposed by Balcan, Beygelzimer & Langford (Balcan et al., 2006). This represents the first non-trivial general-purpose upper bound on label complexity in the agnostic PAC model.
Minimax bounds for active learning
In COLT, 2007
Cited by 58 (5 self)
Abstract. This paper aims to shed light on achievable limits in active learning. Using minimax analysis techniques, we study the achievable rates of classification error convergence for broad classes of distributions characterized by decision boundary regularity and noise conditions. The results clearly indicate the conditions under which one can expect significant gains through active learning. Furthermore, we show that the learning rates derived are tight for "boundary fragment" classes in d-dimensional feature spaces when the feature marginal density is bounded from above and below.
Margin based active learning
In Proc. of the 20th Conference on Learning Theory, 2007
Cited by 42 (9 self)
Abstract. We present a framework for margin-based active learning of linear separators. We instantiate it for a few important cases, some of which have been previously considered in the literature. We analyze the effectiveness of our framework both in the realizable case and in a specific noisy setting related to the Tsybakov small noise condition.
The True Sample Complexity of Active Learning
Cited by 42 (13 self)
We describe and explore a new perspective on the sample complexity of active learning. In many situations where it was generally believed that active learning does not help, we find that active learning does help in the limit, often with exponential improvements in sample complexity. This contrasts with the traditional analysis of active learning problems such as non-homogeneous linear separators or depth-limited decision trees, in which Ω(1/ε) lower bounds are common; we point out that such results must be interpreted carefully, and that finding an ε-good classifier can always be accomplished with a number of samples asymptotically smaller than any such bound. These new insights arise from a subtle variation on the traditional definition of sample complexity, not previously recognized in the active learning literature.
Active learning for anomaly and rarecategory detection
In Advances in Neural Information Processing Systems 18, 2004
Cited by 33 (0 self)
We introduce a novel active-learning scenario in which a user wants to work with a learning algorithm to identify useful anomalies. These are distinguished from the traditional statistical definition of anomalies as outliers or merely ill-modeled points. Our distinction is that the usefulness of anomalies is categorized subjectively by the user. We make two additional assumptions. First, there exist extremely few useful anomalies to be hunted down within a massive dataset. Second, both useful and useless anomalies may sometimes exist within tiny classes of similar anomalies. The challenge is thus to identify "rare category" records in an unlabeled noisy set with help (in the form of class labels) from a human expert who has a small budget of datapoints that they are prepared to categorize. We propose a technique to meet this challenge, which assumes a mixture model fit to the data, but otherwise makes no assumptions on the particular form of the mixture components. This property promises wide applicability in real-life scenarios and for various statistical models. We give an overview of several alternative methods, highlighting their strengths and weaknesses, and conclude with a detailed empirical analysis. We show that our method can quickly zoom in on an anomaly set containing a few tens of points in a dataset of hundreds of thousands.
Generalized binary search
In Proceedings of the 46th Allerton Conference on Communications, Control, and Computing, 2008
Cited by 30 (0 self)
This paper addresses the problem of noisy Generalized Binary Search (GBS). GBS is a well-known greedy algorithm for determining a binary-valued hypothesis through a sequence of strategically selected queries. At each step, a query is selected that most evenly splits the hypotheses under consideration into two disjoint subsets, a natural generalization of the idea underlying classic binary search. GBS is used in many applications, including fault testing, machine diagnostics, disease diagnosis, job scheduling, image processing, computer vision, and active learning. In most of these cases, the responses to queries can be noisy. Past work has provided a partial characterization of GBS, but existing noise-tolerant versions of GBS are suboptimal in terms of query complexity. This paper presents an optimal algorithm for noisy GBS and demonstrates its application to learning multidimensional threshold functions.
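The greedy split rule described in this abstract is easy to state concretely. The sketch below is a noise-free illustration under assumptions of ours: a finite class of threshold hypotheses on a small grid and a hidden target threshold t* = 11. The paper's actual contribution concerns the noisy case, which this sketch does not handle.

```python
# Hypothesis class (illustrative): threshold classifiers h_t(x) = [x >= t].
def make_h(t):
    return lambda x: 1 if x >= t else 0

hypotheses = {t: make_h(t) for t in range(17)}
query_points = range(16)

def oracle(x):
    # Noise-free answers from a hidden target threshold t* = 11 (an
    # assumption for the demo).
    return 1 if x >= 11 else 0

def generalized_binary_search(hypotheses, query_points, oracle):
    """Greedy rule from the abstract: at each step pick the query whose
    answer most evenly bisects the remaining version space, then discard
    every hypothesis inconsistent with the response."""
    version_space = dict(hypotheses)
    asked = []
    while len(version_space) > 1:
        q = min(query_points, key=lambda qq: abs(
            2 * sum(h(qq) for h in version_space.values()) - len(version_space)))
        y = oracle(q)
        version_space = {t: h for t, h in version_space.items() if h(q) == y}
        asked.append(q)
    (t_hat,) = version_space
    return t_hat, asked

t_hat, asked = generalized_binary_search(hypotheses, query_points, oracle)
print(t_hat, asked)  # recovers t* = 11 in about log2(17) queries
```

On this threshold class the even-split rule reduces exactly to classic binary search, which is the "natural generalization" the abstract refers to; with noisy responses the hard elimination step above would discard the truth, which is the failure mode the paper's algorithm repairs.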