An Efficient Membership-Query Algorithm for Learning DNF with Respect to the Uniform Distribution
1994
Cited by 150 (12 self)
We present a membership-query algorithm for efficiently learning DNF with respect to the uniform distribution. In fact, the algorithm properly learns with respect to uniform the class TOP of Boolean functions expressed as a majority vote over parity functions. We also describe extensions of this algorithm for learning DNF over certain nonuniform distributions and for learning a class of geometric concepts that generalizes DNF. Furthermore, we show that DNF is weakly learnable with respect to uniform from noisy examples. Our strong learning algorithm utilizes one of Freund's boosting techniques and relies on the fact that boosting does not require a completely distribution-independent weak learner. The boosted weak learner is a nonuniform extension of a parity-finding algorithm discovered by Goldreich and Levin.
1 Introduction
Consider the following 20-questions-like game between two players, Bob and Alice. Bob has a Disjunctive Normal Form (DNF) expression f in mind. Alice is allo...
On the Complexity of Teaching
Journal of Computer and System Sciences, 1992
Cited by 83 (2 self)
While most theoretical work in machine learning has focused on the complexity of learning, recently there has been increasing interest in formally studying the complexity of teaching. In this paper we study the complexity of teaching by considering a variant of the on-line learning model in which a helpful teacher selects the instances. We measure the complexity of teaching a concept from a given concept class by a combinatorial measure we call the teaching dimension. Informally, the teaching dimension of a concept class is the minimum number of instances a teacher must reveal to uniquely identify any target concept chosen from the class.
A preliminary version of this paper appeared in the Proceedings of the Fourth Annual Workshop on Computational Learning Theory, pages 303--314, August 1991. Most of this research was carried out while both authors were at MIT Laboratory for Computer Science with support provided by ARO Grant DAAL03-86-K-0171, DARPA Contract N00014-89-J-1988, NSF Gr...
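The informal definition of the teaching dimension above can be made concrete with a brute-force sketch. It is not taken from the paper: the representation (each concept as a tuple of 0/1 labels, one per instance) and the names `teaching_set_size` and `teaching_dimension` are assumptions made for illustration, and the exhaustive search is only practical for very small classes.

```python
from itertools import combinations

def teaching_set_size(target, concepts, n_instances):
    """Size of the smallest set of instances whose labels under `target`
    are inconsistent with every other concept in the class."""
    others = [c for c in concepts if c != target]
    for k in range(n_instances + 1):
        for subset in combinations(range(n_instances), k):
            # Every other concept must disagree with the target somewhere in the subset.
            if all(any(c[i] != target[i] for i in subset) for c in others):
                return k
    raise AssertionError("unreachable: the full instance set always separates distinct concepts")

def teaching_dimension(concepts, n_instances):
    """Worst case, over all targets in the class, of the smallest teaching set."""
    return max(teaching_set_size(c, concepts, n_instances) for c in concepts)

# Hypothetical example class: the three singletons over 3 instances plus the empty concept.
singletons_and_empty = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 0)]
```

In this example each singleton is taught by its one positive instance, while the empty concept needs all three negative instances, so the class has teaching dimension 3.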
General Bounds on Statistical Query Learning and PAC Learning with Noise via Hypothesis Boosting
In Proceedings of the 34th Annual Symposium on Foundations of Computer Science, 1993
Cited by 41 (5 self)
We derive general bounds on the complexity of learning in the Statistical Query model and in the PAC model with classification noise. We do so by considering the problem of boosting the accuracy of weak learning algorithms which fall within the Statistical Query model. This new model was introduced by Kearns [12] to provide a general framework for efficient PAC learning in the presence of classification noise. We first show a general scheme for boosting the accuracy of weak SQ learning algorithms, proving that weak SQ learning is equivalent to strong SQ learning. The boosting is efficient and is used to show our main result of the first general upper bounds on the complexity of strong SQ learning. Specifically, we derive simultaneous upper bounds with respect to $\epsilon$ on the number of queries, $O(\log^2 \frac{1}{\epsilon})$, the Vapnik-Chervonenkis dimension of the query space, $O(\log \frac{1}{\epsilon} \log\log \frac{1}{\epsilon})$, and the inverse of the minimum tolerance, $O(\frac{1}{\epsilon} \log \frac{1}{\epsilon})$. In addition, we show that these general upper bounds are nearly optimal by describing a class of learning problems for which we simultaneously lower bound the number of queries by $\Omega(\log \frac{1}{\epsilon})$ and the inverse of the minimum tolerance by $\Omega(\frac{1}{\epsilon})$. We further apply our boosting results in the SQ model to learning in the PAC model with classification noise. Since nearly all PAC learning algorithms can be cast in the SQ model, we can apply our boosting techniques to convert these PAC algorithms into highly efficient SQ algorithms. By simulating these efficient SQ algorithms in the PAC model with classification noise, we show that nearly all PAC algorithms can be converted into highly efficient PAC algorithms which ...
*Author was supported by DARPA Contract N00014-87-K-825 and by NSF Grant CCR-89-14428. Author's net address: jaa@theory.lcs.mit.edu
†Author was supported by an NDSEG Fellowship and ...
On Using Extended Statistical Queries to Avoid Membership Queries
Journal of Machine Learning Research, 2002
Cited by 25 (9 self)
The Kushilevitz-Mansour (KM) algorithm is an algorithm that finds all the "large" Fourier coefficients of a Boolean function. It is the main tool for learning decision trees and DNF expressions in the PAC model with respect to the uniform distribution. The algorithm requires access to the membership query (MQ) oracle. The access is often unavailable in learning applications and thus the KM algorithm cannot be used.
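To illustrate what a "large" Fourier coefficient is, the sketch below computes every coefficient of a small Boolean function by exhaustive enumeration and keeps those whose magnitude reaches a threshold. This is not the KM algorithm, which avoids enumerating all 2^n coefficients. The sketch assumes functions map {0,1}^n to {-1,+1}, and the names `fourier_coefficient`, `large_coefficients`, and `theta` are hypothetical.

```python
from itertools import product

def fourier_coefficient(f, n, s):
    """Exact coefficient of f on the parity indexed by the 0/1 mask s:
    the average over all x in {0,1}^n of f(x) * (-1)^(sum of x_i with s_i = 1)."""
    total = 0
    for x in product((0, 1), repeat=n):
        parity = sum(xi for xi, si in zip(x, s) if si) % 2
        total += f(x) * (-1 if parity else 1)
    return total / 2 ** n

def large_coefficients(f, n, theta):
    """All masks whose coefficient has absolute value at least theta."""
    found = {}
    for s in product((0, 1), repeat=n):
        coeff = fourier_coefficient(f, n, s)
        if abs(coeff) >= theta:
            found[s] = coeff
    return found

def parity_on_bits_0_and_2(x):
    """Hypothetical target: -1 when x[0] XOR x[2] is 1, else +1."""
    return -1 if (x[0] ^ x[2]) else 1
```

For the example target, all of the Fourier weight sits on the single mask (1, 0, 1), so `large_coefficients(parity_on_bits_0_and_2, 3, 0.5)` contains only that mask. Each call to `f` here plays the role of a membership query, which is why exact or approximate coefficient finding is tied to the MQ oracle.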
Randomly Fallible Teachers: Learning Monotone DNF with an Incomplete Membership Oracle
Machine Learning, 1994
Cited by 23 (4 self)
We introduce a new fault-tolerant model of algorithmic learning using an equivalence oracle and an incomplete membership oracle, in which the answers to a random subset of the learner's membership queries may be missing. We demonstrate that, with high probability, it is still possible to learn monotone DNF formulas in polynomial time, provided that the fraction of missing answers is bounded by some constant less than one. Even when half the membership queries are expected to yield no information, our algorithm will exactly identify m-term, n-variable monotone DNF formulas with an expected $O(mn^2)$ queries. The same task has been shown to require exponential time using equivalence queries alone. We extend the algorithm to handle some one-sided errors, and discuss several other possible error models. It is hoped that this work may lead to a better understanding of the power of membership queries and the effects of faulty teachers on query models of concept learning.
Keywords: concep...
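For background, the sketch below shows the standard way a complete membership oracle helps with monotone DNF: a positive example is greedily reduced to a single term by switching bits off while the oracle still answers "positive". This is the textbook step for the fault-free setting, not the paper's algorithm for incomplete oracles, and it assumes an oracle that always answers correctly. The names `minimize_positive_example`, `member`, and `example_target` are hypothetical.

```python
def minimize_positive_example(x, member):
    """Greedily switch 1-bits off while `member` still reports a positive.
    For a monotone target this returns the indicator of one term, using
    at most len(x) membership queries."""
    x = list(x)
    for i in range(len(x)):
        if x[i] == 1:
            x[i] = 0
            if not member(tuple(x)):
                x[i] = 1  # bit i is needed to keep the example positive
    return tuple(x)

def example_target(x):
    """Hypothetical monotone DNF over 4 variables: (x0 AND x1) OR (x2 AND x3)."""
    return bool((x[0] and x[1]) or (x[2] and x[3]))
```

Starting from the all-ones example, the left-to-right scan drops x0 and x1 and keeps x2 and x3, recovering the term x2 AND x3. When some oracle answers are missing, as in the model described above, this simple loop can no longer rely on every query, which is the difficulty the paper addresses.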
Learning From a Consistently Ignorant Teacher
1994
Cited by 22 (8 self)
One view of computational learning theory is that of a learner acquiring the knowledge of a teacher. We introduce a formal model of learning capturing the idea that teachers may have gaps in their knowledge. In particular, we consider learning from a teacher who labels examples "+" (a positive instance of the concept being learned), "-" (a negative instance of the concept being learned), and "?" (an instance with unknown classification), in such a way that knowledge of the concept class and all the positive and negative examples is not sufficient to determine the labelling of any of the examples labelled with "?". The goal of the learner is not to compensate for the ignorance of the teacher by attempting to infer "+" or "-" labels for the examples labelled with "?", but is rather to learn (an approximation to) the ternary labelling presented by the teacher. Thus, the goal of the learner is still to acquire the knowledge of the teacher, but now the learner must also ...
Probabilistic Analysis of Learning in Artificial Neural Networks: The PAC Model and its Variants
1994
Cited by 16 (4 self)
There are a number of mathematical approaches to the study of learning and generalization in artificial neural networks. Here we survey the 'probably approximately correct' (PAC) model of learning and some of its variants. These models, much-studied since the introduction of the basic PAC model by Valiant in 1984, provide a probabilistic framework for the discussion of generalization and learning.
Contents
1 Introduction
2 The Basic PAC Model of Learning
3 VC-Dimension and Growth Function
4 VC-Dimension and Linear Dimension
5 A Useful Probability Theorem
6 PAC Learning and the VC-Dimension
7 VC-Dimension of Binary-Output Networks
7.1 Introduction
7.2 Linearly weighted neural networks
7.3 Linear threshold networks
7.4 Other activation functions
7.5 The effect of weight restrictions
8 Computational Complexity of Learning
9 Stochastic Concepts
10 Distribution-Specific Learning
11 Graph Dimension and Multiple-Output Nets
11.1 T...
Learning Nonoverlapping Perceptron Networks From Examples and Membership Queries
1994
Cited by 13 (6 self)
We investigate, within the PAC learning model, the problem of learning nonoverlapping perceptron networks (also known as read-once formulas over a weighted threshold basis). These are loop-free neural nets in which each node has only one outgoing weight. We give a polynomial time algorithm that PAC learns any nonoverlapping perceptron network using examples and membership queries. The algorithm is able to identify both the architecture and the weight values necessary to represent the function to be learned. Our results shed some light on the effect of the overlap on the complexity of learning in neural networks.
Keywords: Neural networks, PAC learning, nonoverlapping, read-once formula, learning with queries
1 Introduction
Despite the excitement generated recently by neural networks, learning in these systems has proven to be very difficult from a theoretical perspective (Blum and Rivest, 1988; Judd, 1988; Kearns and Valiant, 1988; Lin and Vitter, 1991). For this reason resea...

