## Weakly Learning DNF and Characterizing Statistical Query Learning Using Fourier Analysis (1994)

Venue: Proceedings of the Twenty-Sixth Annual Symposium on Theory of Computing

Citations: 118 (22 self)

### BibTeX

```bibtex
@inproceedings{Blum94weaklylearning,
  author    = {Avrim Blum and Merrick Furst and Jeffrey Jackson and Michael Kearns and Yishay Mansour and Steven Rudich},
  title     = {Weakly Learning DNF and Characterizing Statistical Query Learning Using Fourier Analysis},
  booktitle = {Proceedings of the Twenty-Sixth Annual Symposium on Theory of Computing},
  year      = {1994},
  pages     = {253--262},
  publisher = {ACM Press}
}
```

### Abstract

We present new results on the well-studied problem of learning DNF expressions. We prove that an algorithm due to Kushilevitz and Mansour [13] can be used to weakly learn DNF formulas with membership queries with respect to the uniform distribution. This is the first positive result known for learning general DNF in polynomial time in a nontrivial model. Our results should be contrasted with those of Kharitonov [12], who proved that AC⁰ is not efficiently learnable in this model based on cryptographic assumptions. We also present efficient learning algorithms in various models for the read-k and SAT-k subclasses of DNF. We then turn our attention to the recently introduced statistical query model of learning [9]. This model is a restricted version of the popular Probably Approximately Correct (PAC) model, and practically every PAC learning algorithm falls into the statistical query model [9]. We prove that DNF and decision trees are not even weakly learnable in polynomial time in this model. This result is information-theoretic and therefore does not rely on any unproven assumptions, and it demonstrates that no straightforward modification of the existing algorithms for learning various restricted forms of DNF and decision trees will solve the general problem. These lower bounds are a corollary of a more general characterization of the complexity of statistical query learning in terms of the number of uncorrelated functions in the concept class. The underlying tool for all of our results is the Fourier analysis of the concept class to be learned.
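As a concrete illustration of the Fourier machinery the abstract refers to, the following minimal sketch (not from the paper; function names and the two-variable AND example are illustrative) computes the Fourier coefficients f̂(A) = E[f(x)χ_A(x)] of a ±1-valued boolean function under the uniform distribution and checks Parseval's identity:

```python
from itertools import product

def chi(A, x):
    """Parity character chi_A(x) = (-1)^(sum of x_i for i in A)."""
    return -1 if sum(x[i] for i in A) % 2 else 1

def fourier_coefficient(f, A, n):
    """f_hat(A) = E_{x uniform over {0,1}^n}[f(x) * chi_A(x)], computed exactly."""
    return sum(f(x) * chi(A, x) for x in product((0, 1), repeat=n)) / 2 ** n

# Illustrative target: AND of two variables, encoded with +/-1 outputs.
n = 2
f = lambda x: 1 if x[0] and x[1] else -1
coeffs = {A: fourier_coefficient(f, A, n)
          for A in [(), (0,), (1,), (0, 1)]}

# Parseval: for a +/-1-valued f, the squared coefficients sum to E[f^2] = 1.
assert abs(sum(c * c for c in coeffs.values()) - 1.0) < 1e-9
```

For this AND function every coefficient has magnitude 1/2, so even a single coefficient already gives a weak correlation with f, which is the kind of structure the weak-learning results exploit.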

### Citations

3351 | Induction of decision trees
- Quinlan
- 1986
Citation Context: ...icted forms of DNF and decision trees from passive random examples (and also several algorithms proposed in the experimental machine learning communities, such as the ID3 algorithm for decision trees [22] and its variants) will solve the general problem. The unifying tool for all of our results is the Fourier analysis of a finite class of boolean functions on the hypercube. 1 Introduction and History ...

1693 | A theory of the learnable
- Valiant
- 1984
Citation Context: ...ing DNF expressions. The problem of efficiently learning DNF formulas in any nontrivial model of learning has been of central interest in computational learning theory since the seminal paper of Valiant [18] introducing the popular Probably Approximately Correct (PAC) learning model. Despite the importance of this problem, to date no polynomial time algorithm for learning unrestricted DNF has been discov...

624 | Learnability and the Vapnik-Chervonenkis dimension
- Blumer, Ehrenfeucht, et al.
- 1989
Citation Context: ...Vapnik-Chervonenkis (VC) dimension, which is a distribution-independent quantity and is known to characterize the number of random examples required to learn in the distribution-independent PAC model [6], the statistical query dimension is a distribution-dependent quantity. It is possible to prove a one-sided polynomial relationship between the two quantities: namely, if F is a class of VC dimension...

288 | Efficient noise-tolerant learning from statistical queries
- Kearns
- 1993
Citation Context: ...ons. We also present efficient learning algorithms in various models for the read-k and SAT-k subclasses of DNF. We then turn our attention to the recently introduced statistical query model of learning [9]. This model is a restricted version of the popular Probably Approximately Correct (PAC) model, and practically every PAC learning algorithm falls into the statistical query model [9]. We prove that D...
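To make the statistical query model concrete: instead of seeing labeled examples, the learner asks for expectations E_{x∼D}[g(x, f(x))], and an oracle may answer with any value within an additive tolerance τ of the truth. A minimal sketch under those assumptions (the oracle, target concept, and query below are illustrative, not taken from the paper):

```python
from itertools import product

def stat_oracle(f, query, tau, n):
    """Answer a statistical query E_{x uniform}[query(x, f(x))] within +/- tau.

    A legal SQ oracle may return any value in [exact - tau, exact + tau];
    here we model one that shifts the exact expectation by tau/2.
    """
    exact = sum(query(x, f(x)) for x in product((0, 1), repeat=n)) / 2 ** n
    return exact + tau / 2

# Estimate the correlation of the label with the first input bit.
n = 3
f = lambda x: 1 if x[0] else -1          # illustrative target concept
corr = stat_oracle(f, lambda x, label: label * (1 if x[0] else -1),
                   tau=0.01, n=n)
assert abs(corr - 1.0) <= 0.01           # true correlation is exactly 1
```

The lower bound quoted in the abstract says that when a concept class contains superpolynomially many pairwise uncorrelated functions, no polynomial number of such tolerance-limited queries can even weakly identify the target.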

278 | Constant depth circuits, Fourier transform, and learnability
- Linial, Mansour, et al.
- 1993
Citation Context: ...n of the existing algorithms for learning various restricted forms of DNF and decision trees will solve the general problem. All of our results rely heavily on the Fourier representation of functions [15, 13, 17], demonstrating once again the utility of these tools in computational learning theory. 2 Definitions and Notation 2.1 Learning Models A concept is a boolean function on an instance space X, and for ...

194 | Toward efficient agnostic learning
- Kearns, Schapire, et al.
- 1994
Citation Context: ...l inputs ~x as well as h's random choices, we get Pr[h ≠ f] ≤ (1/2) E[(f − g)²] (Lemma 3). A similar but slightly weaker randomized approximation method was given by Kearns, Schapire, and Sellie [13]. Putting the results of this section together, we have the following. Theorem 4: A concept class F is weakly learnable with membership queries with respect to the uniform distribution if there are pol...

190 | Computational limitations on learning from examples
- Pitt, Valiant
- 1988
Citation Context: ...entation-dependent hardness results (that is, hardness results that assume certain syntactic restrictions on the learning algorithm's hypothesis) for learning even some rather restricted forms of DNF [21, 12]. These hardness results left unresolved the status of learning DNF formulas in the absence of hypothesis restrictions, or with respect to the uniform distribution, or using membership queries. We pro...

183 | Learning decision trees using the Fourier spectrum
- Kushilevitz, Mansour
- 1993
Citation Context: ...iv U. Steven Rudich Carnegie Mellon U. November 1993 Abstract We present new results on the well-studied problem of learning DNF expressions. We prove that an algorithm due to Kushilevitz and Mansour [13] can be used to weakly learn DNF formulas with membership queries with respect to the uniform distribution. This is the first positive result known for learning general DNF in polynomial time in a nontr...

167 | On the learnability of boolean formulae - Kearns, Li, et al. - 1987

111 | Learning conjunctions of Horn clauses
- Angluin, Frazier, et al.
- 1992
Citation Context: ... his results would soon be extended to DNF; our result shows otherwise.) Due to the lack of positive results for unrestricted DNF, various restricted DNF classes have attracted considerable attention [4, 2, 8, 3, 1, 5, 14, 6]. We extend these results. In particular, it is known that the class of read-k DNF (DNF in which every variable appears at most k times) is learnable in polynomial time using membership queries for k ...

85 | Harmonic analysis of polynomial threshold functions
- Bruck
- 1990
Citation Context: ...niform using queries. PT̂₁ is a rather general class containing many functions, such as majority, that are not approximable by AC⁰ circuits. Our weak learnability proof builds on the work of Bruck [7]. Theorem 11: PT̂₁ is weakly learnable using membership queries with respect to the uniform distribution. Proof: By definition, for any f ∈ PT̂₁ there is some g = Σ_{A∈S} ĝ(A)A such that f = sign(g),...

82 | Exact learning via the monotone theory
- Bshouty
- 1993
Citation Context: ... his results would soon be extended to DNF; our result shows otherwise.) Due to the lack of positive results for unrestricted DNF, various restricted DNF classes have attracted considerable attention [4, 2, 8, 3, 1, 5, 14, 6]. We extend these results. In particular, it is known that the class of read-k DNF (DNF in which every variable appears at most k times) is learnable in polynomial time using membership queries for k ...

74 | Cryptographic hardness of distribution-specific learning
- Kharitonov
- 1993
Citation Context: ...respect to the uniform distribution. This is the first positive result known for learning general DNF in polynomial time in a nontrivial model. Our results should be contrasted with those of Kharitonov [12], who proved that AC⁰ is not efficiently learnable in this model based on cryptographic assumptions. We also present efficient learning algorithms in various models for the read-k and SAT-k subclasses of D...

51 | Fast learning of k-term DNF formulas with queries
- Blum, Rudich
- 1995
Citation Context: ... his results would soon be extended to DNF; our result shows otherwise.) Due to the lack of positive results for unrestricted DNF, various restricted DNF classes have attracted considerable attention [4, 2, 8, 3, 1, 5, 14, 6]. We extend these results. In particular, it is known that the class of read-k DNF (DNF in which every variable appears at most k times) is learnable in polynomial time using membership queries for k ...

48 | An O(n^(log log n)) learning algorithm for DNF under the uniform distribution
- Mansour
- 1995
Citation Context: ...n of the existing algorithms for learning various restricted forms of DNF and decision trees will solve the general problem. All of our results rely heavily on the Fourier representation of functions [15, 13, 17], demonstrating once again the utility of these tools in computational learning theory. 2 Definitions and Notation 2.1 Learning Models A concept is a boolean function on an instance space X, and for ...

35 | Exact learning of read-twice DNF formulas
- Aizenstein, Pitt
- 1991

27 | Improved learning of AC⁰ functions - Furst, Jackson, et al. - 1991

24 | On learning visual concepts and DNF formulae
- Kushilevitz, Roth
- 1996

23 | Learning 2DNF formulas and k decision trees
- Hancock
- 1991

23 | On using the Fourier transform to learn disjoint DNF
- Khardon
- 1994
Citation Context: ...9) By restricting the size of terms in the SAT-k DNF's considered we can extend the above to a distribution-free learning result (this generalizes a similar result for SAT-1 (disjoint) DNF by Khardon [11]). Theorem 10: For any k, the class of SAT-k O(log s)-DNF formulas of s terms can be learned exactly by a deterministic learning algorithm which uses membership queries and runs in time polynomial in n...

22 | Read-thrice DNF is hard to learn with membership and equivalence queries
- Aizenstein, Hellerstein, et al.
- 1992

17 | Exact Learning of Read-k Disjoint DNF and Not-So-Disjoint DNF
- Aizenstein, Pitt
- 1992

17 | Improved learning of AC⁰ functions
- Furst, Jackson, et al.
- 1991
Citation Context: ...his theorem, we will need to use an extension of the Fourier theory to an arbitrary distribution; this extension has been examined in the learning theory literature before by Furst, Jackson and Smith [7]. Thus let D be an arbitrary probability distribution over {0,1}^n. Then for any two real-valued functions f and g over {0,1}^n, we can define the inner product with respect to D by ⟨f, g⟩_D = E_D[...
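The distribution-dependent inner product described in this context, ⟨f, g⟩_D = E_D[f(x)g(x)], is straightforward to compute exactly on a small instance space. A minimal sketch, with a hypothetical product distribution chosen purely for illustration:

```python
from itertools import product

def inner_product_D(f, g, D):
    """<f, g>_D = E_{x ~ D}[f(x) g(x)] for a distribution D given as a dict
    mapping each point of {0,1}^n to its probability."""
    return sum(p * f(x) * g(x) for x, p in D.items())

# Illustrative product distribution over {0,1}^2: each bit is 1 with prob 0.25.
D = {x: (0.25 if x[0] else 0.75) * (0.25 if x[1] else 0.75)
     for x in product((0, 1), repeat=2)}

f = lambda x: 1 if x[0] else -1
g = lambda x: 1 if x[1] else -1

# The bits are independent under D, so <f, g>_D = E_D[f] * E_D[g] = (-0.5)^2.
assert abs(inner_product_D(f, g, D) - 0.25) < 1e-12
```

Under the uniform distribution this inner product of two distinct parities would be 0; the nonzero value here shows why the Fourier basis must be re-orthogonalized when moving to an arbitrary D, which is the extension the context refers to.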

16 | Toward Efficient Agnostic Learning
- Kearns, Schapire, et al.
- 1992
Citation Context: ...instances x as well as h's random choices, we get Pr[h(x) ≠ f(x)] ≤ (1/2) E[(f − g)²] (Lemma 3). A similar but slightly weaker randomized approximation method was given by Kearns, Schapire, and Sellie [10]. Putting the results of this section together, we have the following. Theorem 4: A concept class F is weakly learnable with membership queries with respect to the uniform distribution if there are pol...
