## Active learning using arbitrary binary valued queries (1993)

### Download Links

- [www.mit.edu]
- [web.mit.edu]
- DBLP

Venue: Machine Learning

Citations: 26 (1 self)

### BibTeX

```bibtex
@ARTICLE{Kulkarni93activelearning,
  author  = {S. R. Kulkarni and David Haussler},
  title   = {Active learning using arbitrary binary valued queries},
  journal = {Machine Learning},
  year    = {1993},
  pages   = {23--35}
}
```

### Abstract

The original and most widely studied PAC model for learning assumes a passive learner, in the sense that the learner plays no role in obtaining information about the unknown concept: the samples are simply drawn independently from some probability distribution. Some work has been done on studying more powerful oracles and how they affect learnability. To find bounds on the improvement in sample complexity that can be expected from using oracles, we consider active learning in the sense that the learner has complete control over the information received. Specifically, we allow the learner to ask arbitrary yes/no questions. We consider both active learning under a fixed distribution and distribution-free active learning. In the case of active learning, the underlying probability distribution is used only to measure distance between concepts. For learnability with respect to a fixed distribution, active learning does not enlarge the set of learnable concept classes, but can improve the sample complexity. For distribution-free learning, it is shown that a concept class is actively learnable iff it is finite, so that active learning is in fact less powerful than the usual passive learning model. We also consider a form of distribution-free learning in which the learner knows the distribution being used, so that "distribution-free" refers only to the requirement that a bound on the number of queries can be obtained uniformly over all distributions. Even with this side information, a concept class is actively learnable iff it has finite VC dimension, so that active learning with side information still does not enlarge the set of learnable concept classes.

Keywords: PAC-learning, active learning, queries, oracles
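The query bound behind the fixed-distribution result can be made concrete with a small sketch (not code from the paper): for threshold concepts on [0, 1] under the uniform distribution, arbitrary yes/no queries amount to bisection on the threshold, so accuracy ε is reached in about ⌈log₂ N(ε)⌉ queries, where N(ε) is the size of a finite ε-cover of the class. The function and oracle names below are illustrative assumptions, not notation from the paper.

```python
import math

def active_learn_threshold(answer_query, epsilon):
    """Actively learn a threshold concept c_t(x) = 1[x >= t] on [0, 1]
    under the uniform distribution, to accuracy epsilon.

    `answer_query` stands in for an oracle answering arbitrary yes/no
    questions; each question here is "is the unknown threshold below m?",
    so every answer halves the candidate interval (bisection).
    """
    lo, hi = 0.0, 1.0
    queries = 0
    # Under the uniform distribution the error between thresholds s and t
    # is d_P(s, t) = |s - t|, so once the interval has width <= 2*epsilon,
    # its midpoint is within epsilon of the true threshold.
    while hi - lo > 2 * epsilon:
        mid = (lo + hi) / 2
        if answer_query(mid):  # "is t < mid?"
            hi = mid
        else:
            lo = mid
        queries += 1
    return (lo + hi) / 2, queries

# Example run with a hypothetical unknown threshold.
t_true = 0.3731
estimate, n_queries = active_learn_threshold(lambda m: t_true < m, 1e-3)
assert abs(estimate - t_true) <= 1e-3
# Matches the metric-entropy bound: N(eps) is about 1/(2*eps) cover
# points for this class, so roughly ceil(log2(500)) = 9 queries suffice.
assert n_queries == math.ceil(math.log2(1 / (2 * 1e-3)))
```

A passive learner drawing i.i.d. uniform samples needs on the order of (1/ε) log(1/ε) examples for the same guarantee, which is the exponential gap in sample complexity the abstract refers to.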

### Citations

1695 | A Theory of the Learnable - Valiant - 1984

1483 | Information Theory and Reliable Communication - Gallager - 1968

647 | Queries and concept learning - Angluin - 1988 |

567 | Convergence of Stochastic Processes - Pollard - 1984 |

372 | Decision theoretic generalizations of the PAC model for neural net and other learning applications - Haussler - 1992 |

221 | Learning from noisy examples - Angluin, Laird - 1988

191 | A general lower bound on the number of examples needed for learning - Ehrenfeucht, Haussler, et al. - 1989

167 | Learning in the presence of malicious errors - Kearns, Li - 1993 |

135 | Central limit theorems for empirical measures - Dudley - 1978

68 | Classifying learnable geometric concepts with the Vapnik-Chervonenkis dimension - Blumer, Ehrenfeucht, et al. - 1986

47 | Types of noise in data for concept learning - Sloan - 1988

40 | Coping with errors in binary search procedures - Kleitman, Meyer, et al. - 1980

39 | Learnability by fixed distributions - Benedek, Itai - 1988

26 | On the sample complexity of PAC-learning using random and chosen examples - Eisenberg, Rivest - 1990

18 | On the uniform convergence of relative frequencies to their probabilities - Vapnik, Chervonenkis - 1971

16 | Extending the Valiant learning model - Amsterdam - 1988

13 | Learning by distances - Ben-David, Itai, et al. - 1990 |

10 | Problems of computational and information complexity in machine vision and learning - Kulkarni - 1991 |

9 | On metric entropy, Vapnik-Chervonenkis dimension, and learnability for a class of distributions - Kulkarni - 1989

8 | ε-entropy and ε-capacity of sets in functional spaces - Kolmogorov, Tihomirov - 1961

8 | Learning over classes of distributions - Natarajan - 1988 |

1 | Types of queries for concept learning (Technical Report YALEU/DCS/TR-479) - Angluin - 1986

1 | Learning over classes of distributions - thesis - 1988 |

1 | Kolmogorov's work on ε-entropy of functional classes and the superposition of functions - Tikhomirov - 1963