An Efficient Membership-Query Algorithm for Learning DNF with Respect to the Uniform Distribution
, 1994
Cited by 165 (13 self)
We present a membership-query algorithm for efficiently learning DNF with respect to the uniform distribution. In fact, the algorithm properly learns with respect to uniform the class TOP of Boolean functions expressed as a majority vote over parity functions. We also describe extensions of this algorithm for learning DNF over certain nonuniform distributions and for learning a class of geometric concepts that generalizes DNF. Furthermore, we show that DNF is weakly learnable with respect to uniform from noisy examples. Our strong learning algorithm utilizes one of Freund's boosting techniques and relies on the fact that boosting does not require a completely distribution-independent weak learner. The boosted weak learner is a nonuniform extension of a parity-finding algorithm discovered by Goldreich and Levin. 1 Introduction Consider the following 20-questions-like game between two players, Bob and Alice. Bob has a Disjunctive Normal Form (DNF) expression f in mind. Alice is allo...
Weakly Learning DNF and Characterizing Statistical Query Learning Using Fourier Analysis
 IN PROCEEDINGS OF THE TWENTY-SIXTH ANNUAL SYMPOSIUM ON THEORY OF COMPUTING
, 1994
Cited by 119 (22 self)
We present new results on the well-studied problem of learning DNF expressions. We prove that an algorithm due to Kushilevitz and Mansour [13] can be used to weakly learn DNF formulas with membership queries with respect to the uniform distribution. This is the first positive result known for learning general DNF in polynomial time in a nontrivial model. Our results should be contrasted with those of Kharitonov [12], who proved that AC^0 is not efficiently learnable in this model based on cryptographic assumptions. We also present efficient learning algorithms in various models for the read-k and SAT-k subclasses of DNF. We then turn our attention to the recently introduced statistical query model of learning [9]. This model is a restricted version of the popular Probably Approximately Correct (PAC) model, and practically every PAC learning algorithm falls into the statistical query model [9]. We prove that DNF and decision trees are not even weakly learnable in polynomial time in this model. This result is information-theoretic and therefore does not rely on any unproven assumptions, and demonstrates that no straightforward modification of the existing algorithms for learning various restricted forms of DNF and decision trees will solve the general problem. These lower bounds are a corollary of a more general characterization of the complexity of statistical query learning in terms of the number of uncorrelated functions in the concept class. The underlying tool for all of our results is the Fourier analysis of the concept class to be learned.
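As a rough illustration of the Fourier view mentioned in this abstract, the sketch below computes the correlation of a small DNF with each parity function by brute-force enumeration. It is not the Kushilevitz and Mansour algorithm; the helper names (`chi`, `fourier_coefficient`, `dnf`, `best_parity`) and the +1/-1 encoding of true/false are assumptions made for this example only.

```python
from itertools import combinations, product

def chi(S, x):
    """Parity character: +1 if x has an even number of ones on S, else -1."""
    return -1 if sum(x[i] for i in S) % 2 else 1

def fourier_coefficient(f, n, S):
    """Exact E_x[f(x) * chi_S(x)] for x uniform over {0,1}^n (exponential in n)."""
    total = sum(f(x) * chi(S, x) for x in product((0, 1), repeat=n))
    return total / 2 ** n

def dnf(terms):
    """Hypothetical DNF encoding: each term is a list of (index, required_bit)."""
    def f(x):
        satisfied = any(all(x[i] == b for i, b in term) for term in terms)
        return 1 if satisfied else -1
    return f

def best_parity(f, n):
    """Brute-force search for the parity with the largest |correlation| with f."""
    subsets = [S for k in range(n + 1) for S in combinations(range(n), k)]
    return max(subsets, key=lambda S: abs(fourier_coefficient(f, n, S)))

# x0 AND x1 on two variables: every parity has correlation of magnitude 0.5.
f = dnf([[(0, 1), (1, 1)]])
coeffs = {S: fourier_coefficient(f, 2, S) for S in [(), (0,), (1,), (0, 1)]}
```

A parity whose coefficient is bounded away from zero is a weak hypothesis for f under the uniform distribution; the enumeration here is only feasible for tiny n and stands in for the efficient search described in the literature.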
Learning Read-Once Formulas with Queries
 J. ACM
, 1989
Cited by 110 (23 self)
A read-once formula is a boolean formula in which each variable occurs at most once. Such formulas are also called μ-formulas or boolean trees. This paper treats the problem of exactly identifying an unknown read-once formula using specific kinds of queries. The main results are a polynomial time algorithm for exact identification of monotone read-once formulas using only membership queries, and a polynomial time algorithm for exact identification of general read-once formulas using equivalence and membership queries (a protocol based on the notion of a minimally adequate teacher [1]). Our results improve on Valiant's previous results for read-once formulas [26]. We also show that no polynomial time algorithm using only membership queries or only equivalence queries can exactly identify all read-once formulas. 1 Introduction The goal of computational learning theory is to define and study useful models of learning phenomena from an algorithmic point of view. Since there are a variety ...
Structure Identification in Relational Data
, 1997
Cited by 77 (2 self)
This paper presents several investigations into the prospects for identifying meaningful structures in empirical data, namely, structures permitting effective organization of the data to meet requirements of future queries. We propose a general framework whereby the notion of identifiability is given a precise formal definition similar to that of learnability. Using this framework, we then explore if a tractable procedure exists for deciding whether a given relation is decomposable into a constraint network or a CNF theory with desirable topology and, if the answer is positive, identifying the desired decomposition. Finally, we
How Many Queries are Needed to Learn?
, 1996
Cited by 65 (8 self)
We investigate the query complexity of exact learning in the membership and (proper) equivalence query model. We give a complete characterization of concept classes that are learnable with a polynomial number of polynomial-sized queries in this model. We give applications of this characterization, including results on learning a natural subclass of DNF formulas, and on learning with membership queries alone. Query complexity has previously been used to prove lower bounds on the time complexity of exact learning. We show a new relationship between query complexity and time complexity in exact learning: If any "honest" class is exactly and properly learnable with polynomial query complexity, but not learnable in polynomial time, then P ≠ NP. In particular, we show that an honest class is exactly polynomial-query learnable if and only if it is learnable using an oracle for Σ_4^p. 1 Introduction Today concept learning is studied under two rigorous frameworks which model t...
Exact Learning Boolean Functions via the Monotone Theory
 INFORMATION AND COMPUTATION
, 1995
Cited by 50 (3 self)
We study the learnability of boolean functions from membership and equivalence queries. We develop the Monotone Theory that proves 1) Any boolean function is learnable in polynomial time in its minimal DNF size, its minimal CNF size and the number of variables n. In particular, 2) Decision trees are learnable. Our algorithms are in the model of exact learning with membership queries and unrestricted equivalence queries. The hypotheses to the equivalence queries and the output hypotheses are depth 3 formulas.
Teaching a Smarter Learner
 Journal of Computer and System Sciences
, 1994
Cited by 40 (1 self)
We introduce a formal model of teaching in which the teacher is tailored to a particular learner, yet the teaching protocol is designed so that no collusion is possible. Not surprisingly, such a model remedies the nonintuitive aspects of other models in which the teacher must successfully teach any consistent learner. We prove that any class that can be exactly identified by a deterministic polynomial-time algorithm with access to a very rich set of example-based queries is teachable by a computationally unbounded teacher and a polynomial-time learner. In addition, we present other general results relating this model of teaching to various previous results. We also consider the problem of designing teacher/learner pairs in which both the teacher and learner are polynomial-time algorithms and describe teacher/learner pairs for the classes of 1-decision lists and Horn sentences. 1 Introduction Recently, there has been interest in developing formal models of teaching [4, 10, ...
Probably Approximately Correct Learning
 Proceedings of the Eighth National Conference on Artificial Intelligence
, 1990
Cited by 40 (1 self)
This paper surveys some recent theoretical results on the efficiency of machine learning algorithms. The main tool described is the notion of Probably Approximately Correct (PAC) learning, introduced by Valiant. We define this learning model and then look at some of the results obtained in it. We then consider some criticisms of the PAC model and the extensions proposed to address these criticisms. Finally, we look briefly at other models recently proposed in computational learning theory. 2 Introduction It's a dangerous thing to try to formalize an enterprise as complex and varied as machine learning so that it can be subjected to rigorous mathematical analysis. To be tractable, a formal model must be simple. Thus, inevitably, most people will feel that important aspects of the activity have been left out of the theory. Of course, they will be right. Therefore, it is not advisable to present a theory of machine learning as having reduced the entire field to its bare essentials. All ...
Horn Approximations of Empirical Data
 Artificial Intelligence
, 1995
Cited by 33 (2 self)
Formal AI systems traditionally represent knowledge using logical formulas. Sometimes, however, a model-based representation is more compact and enables faster reasoning than the corresponding formula-based representation. The central idea behind our work is to represent a large set of models by a subset of characteristic models. More specifically, we examine model-based representations of Horn theories, and show that there are large Horn theories that can be exactly represented by an exponentially smaller set of characteristic models. We show that deduction based on a set of characteristic models requires only polynomial time, as it does using Horn theories. More surprisingly, abduction can be performed in polynomial time using a set of characteristic models, whereas abduction using Horn theories is NP-complete. Finally, we discuss algorithms for generating efficient representations of the Horn theory that best approximates a general set of models. 1 Introduction Logical formulas are...
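The idea of characteristic models can be sketched as follows, assuming the usual reading in which the models of a Horn theory are closed under componentwise AND and a characteristic model is one that cannot be obtained by intersecting the others. The function names and the tuple-of-bits encoding are assumptions for illustration; this is a brute-force sketch, not the authors' algorithms.

```python
from itertools import combinations

def intersection_closure(models):
    """Close a set of models (tuples of 0/1) under componentwise AND."""
    closed = set(models)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow safely inside the loop.
        for a, b in combinations(list(closed), 2):
            c = tuple(x & y for x, y in zip(a, b))
            if c not in closed:
                closed.add(c)
                changed = True
    return closed

def characteristic_models(models):
    """Models that are not recoverable by intersecting the remaining models."""
    closed = intersection_closure(models)
    return {m for m in closed if m not in intersection_closure(closed - {m})}

# Three generators yield a Horn-closed set of seven models.
generators = {(1, 1, 0), (0, 1, 1), (1, 0, 1)}
all_models = intersection_closure(generators)
core = characteristic_models(all_models)
```

In this toy case the seven models of the closed set are fully determined by the three characteristic ones, which is the kind of compression the abstract refers to.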
The Inverse Satisfiability Problem
 SIAM Journal on Computing
, 1998
Cited by 33 (6 self)
We study the complexity of telling whether a set of bit-vectors represents the set of all satisfying truth assignments of a Boolean expression of a certain type. We show that the problem is coNP-complete when the expression is required to be in conjunctive normal form with three literals per clause (3CNF). We also prove a dichotomy theorem analogous to the classical one by Schaefer, stating that, unless P=NP, the problem can be solved in polynomial time if and only if the clauses allowed are all Horn, or all anti-Horn, or all 2CNF, or all equivalent to equations modulo two.
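For the four tractable cases named above, a minimal sketch of a membership test can use the standard closure properties associated with Schaefer's classes: AND for Horn, OR for anti-Horn, three-way majority for 2CNF, and three-way XOR for equations modulo two. That these closure tests are the right criterion is an assumption taken from the general literature rather than from this abstract, and the names `closed_under` and `classify` are hypothetical.

```python
from itertools import product

def closed_under(models, op, arity):
    """True if applying `op` componentwise to any `arity` models stays in the set."""
    ms = set(models)
    return all(
        tuple(op(*bits) for bits in zip(*combo)) in ms
        for combo in product(ms, repeat=arity)
    )

def classify(models):
    """Report which of the four clause types could describe the given model set."""
    return {
        "horn": closed_under(models, lambda a, b: a & b, 2),
        "anti_horn": closed_under(models, lambda a, b: a | b, 2),
        "2cnf": closed_under(models, lambda a, b, c: (a & b) | (a & c) | (b & c), 3),
        "affine": closed_under(models, lambda a, b, c: a ^ b ^ c, 3),
    }

# Satisfying assignments of the single clause (not x) or (not y).
result = classify({(0, 0), (0, 1), (1, 0)})
```

Each check examines at most the cube of the number of given bit-vectors, so it runs in time polynomial in the size of the input set.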