Results 1–10 of 19
Learning Intersections and Thresholds of Halfspaces
Abstract

Cited by 65 (22 self)
We give the first polynomial-time algorithm to learn any function of a constant number of halfspaces under the uniform distribution to within any constant error parameter. We also give the first quasipolynomial-time algorithm for learning any function of a polylog number of polynomial-weight halfspaces under any distribution. As special cases of these results we obtain algorithms for learning intersections and thresholds of halfspaces. Our uniform-distribution learning algorithms involve a novel non-geometric approach to learning halfspaces; we use Fourier techniques together with a careful analysis of the noise sensitivity of functions of halfspaces. Our algorithms for learning under any distribution use techniques from real approximation theory to construct low-degree polynomial threshold functions.
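The noise-sensitivity quantity driving the uniform-distribution result can be made concrete with a small Monte Carlo sketch (helper names are illustrative assumptions; majority over ±1 bits stands in for an arbitrary halfspace). It estimates NS_ε(f) = Pr[f(x) ≠ f(y)], where y is x with each bit flipped independently with probability ε:

```python
import random

def halfspace(w, theta, x):
    """Evaluate the halfspace sign(w.x - theta) on x in {-1, +1}^n."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else -1

def noise_sensitivity(f, n, eps, trials=20000, seed=0):
    """Monte Carlo estimate of NS_eps(f): draw x uniformly from
    {-1, +1}^n, flip each bit independently with probability eps to get
    y, and count how often f disagrees on the pair."""
    rng = random.Random(seed)
    disagree = 0
    for _ in range(trials):
        x = [rng.choice((-1, 1)) for _ in range(n)]
        y = [-xi if rng.random() < eps else xi for xi in x]
        if f(x) != f(y):
            disagree += 1
    return disagree / trials

# Majority on 51 bits is the halfspace with w = (1, ..., 1), theta = 0.
n = 51
maj = lambda x: halfspace([1] * n, 0, x)
ns = noise_sensitivity(maj, n, eps=0.01)
```

For a halfspace the noise sensitivity at noise rate ε is known to be O(√ε), which is the kind of bound the Fourier-based analysis exploits.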
Software Abstractions
, 2006
Abstract

Cited by 32 (2 self)
We give an algorithm that with high probability properly learns random monotone t(n)-term DNF under the uniform distribution on the Boolean cube {0, 1}^n. For any polynomially bounded function t(n) ≤ poly(n), the algorithm runs in time poly(n, 1/ε) and with high probability outputs an ε-accurate monotone DNF hypothesis. This is the first algorithm that can learn monotone DNF of arbitrary polynomial size in a reasonable average-case model of learning from random examples only.
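A minimal sketch of the average-case setup (names and the fixed term length k are illustrative assumptions, not the paper's exact parameterization): draw a random monotone DNF, then label uniform examples with it.

```python
import random

def random_monotone_dnf(n, t, k, rng):
    """Draw a random monotone DNF over n variables: t terms, each a
    conjunction of k distinct variables chosen uniformly at random."""
    return [rng.sample(range(n), k) for _ in range(t)]

def eval_dnf(terms, x):
    """x is a tuple in {0,1}^n; a monotone DNF is true iff some term
    has all of its variables set to 1."""
    return any(all(x[i] for i in term) for term in terms)

def uniform_examples(terms, n, m, rng):
    """m labeled examples under the uniform distribution on {0,1}^n."""
    out = []
    for _ in range(m):
        x = tuple(rng.randrange(2) for _ in range(n))
        out.append((x, eval_dnf(terms, x)))
    return out

rng = random.Random(0)
dnf = random_monotone_dnf(n=20, t=5, k=4, rng=rng)
sample = uniform_examples(dnf, n=20, m=10, rng=rng)
```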
Approximate Lineage for Probabilistic Databases
Abstract

Cited by 23 (7 self)
In probabilistic databases, lineage is fundamental to both query processing and understanding the data. Current systems such as Trio or Mystiq use a complete approach in which the lineage for a tuple t is a Boolean formula representing all derivations of t. In large databases lineage formulas can become huge: in one public database (the Gene Ontology) we often observed 10 MB of lineage (provenance) data for a single tuple. In this paper we propose to use approximate lineage, a much smaller formula that keeps track of only the most important derivations and that the system can use to process queries and provide explanations. We discuss in detail two specific kinds of approximate lineage: (1) a conservative approximation called sufficient lineage, which records the most important derivations for each tuple, and (2) polynomial lineage, which is more aggressive and can provide higher compression ratios, and which is based on Fourier approximations of Boolean expressions. We define approximate lineage formally, describe algorithms to compute approximate lineage and formally prove their error bounds, and validate our approach experimentally on a real data set.
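A toy sketch of the sufficient-lineage idea under the usual tuple-independence assumption (all names hypothetical): keep only the k individually most probable derivations. Because the lineage formula is monotone, dropping derivations can only underestimate the tuple's probability, which is what makes the approximation conservative.

```python
import random

def prob_dnf(terms, probs, trials=50000, seed=0):
    """Monte Carlo estimate of the probability that a monotone DNF over
    independent Boolean atoms is true; probs[i] = P(atom i is true)."""
    rng = random.Random(seed)
    atoms = sorted({i for t in terms for i in t})
    hits = 0
    for _ in range(trials):
        val = {i: rng.random() < probs[i] for i in atoms}
        if any(all(val[i] for i in t) for t in terms):
            hits += 1
    return hits / trials

def sufficient_lineage(terms, probs, k):
    """Keep the k derivations (conjunctive terms) with the highest
    individual probability under independence."""
    def term_prob(t):
        p = 1.0
        for i in t:
            p *= probs[i]
        return p
    return sorted(terms, key=term_prob, reverse=True)[:k]

# Toy lineage: a tuple's derivations are conjunctions of atom ids.
probs = {0: 0.9, 1: 0.8, 2: 0.1, 3: 0.05, 4: 0.5}
lineage = [(0, 1), (2, 3), (0, 4), (1, 2)]
small = sufficient_lineage(lineage, probs, k=2)
p_full, p_small = prob_dnf(lineage, probs), prob_dnf(small, probs)
```

Up to sampling error, `p_small` lower-bounds `p_full`, illustrating the one-sided error guarantee described for sufficient lineage.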
Learning Monotone Decision Trees in Polynomial Time
, 2005
Abstract

Cited by 21 (5 self)
We give an algorithm that learns any monotone Boolean function f: {−1, 1}^n → {−1, 1} to any constant accuracy, under the uniform distribution, in time polynomial in n and in the decision tree size of f. This is the first algorithm that can learn arbitrary monotone Boolean functions to high accuracy, using random examples only, in time polynomial in a reasonable measure of the complexity of f. A key ingredient of the result is a new bound showing that the average sensitivity of any monotone function computed by a decision tree of size s must be at most √(log s). This bound has already proved to be of independent utility in the study of decision tree complexity [27]. We generalize the basic inequality and learning result described above in various ways; specifically, to partition size (a stronger complexity measure than decision tree size), to p-biased measures over the Boolean cube (rather than just the uniform distribution), and to real-valued (rather than Boolean-valued) functions.
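The average-sensitivity bound can be checked empirically on a toy monotone function (helper names are assumptions): AND on 3 bits has a 4-leaf decision tree and true average sensitivity 3/4, comfortably below √(log 4) with a base-2 log (the abstract does not specify the base).

```python
import math
import random

def avg_sensitivity(f, n, trials=20000, seed=0):
    """Estimate the average sensitivity (total influence) of f:
    AS(f) = sum_i Pr_x[f(x) != f(x with bit i flipped)], estimated by
    sampling a uniform x in {0,1}^n and a uniform coordinate i, then
    rescaling the disagreement frequency by n."""
    rng = random.Random(seed)
    flips = 0
    for _ in range(trials):
        x = [rng.randrange(2) for _ in range(n)]
        i = rng.randrange(n)
        y = list(x)
        y[i] ^= 1
        if f(x) != f(y):
            flips += 1
    return n * flips / trials

# AND of 3 bits: monotone, computed by a decision tree with s = 4 leaves.
and3 = lambda x: x[0] & x[1] & x[2]
est = avg_sensitivity(and3, 3)  # true value is 3/4
```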
Testing monotone high-dimensional distributions
 In STOC
, 2005
Abstract

Cited by 21 (6 self)
A monotone distribution P over a (partially) ordered domain assigns higher probability to y than to x whenever y ≥ x in the order. We study several natural problems concerning testing properties of monotone distributions over the n-dimensional Boolean cube, given access to random draws from the distribution being tested. We give a poly(n)-time algorithm for testing whether a monotone distribution is equivalent to or ε-far (in the L1 norm) from the uniform distribution. A key ingredient of the algorithm is a generalization of a known isoperimetric inequality for the Boolean cube. We also introduce a method for proving lower bounds on various problems of testing monotone distributions over the n-dimensional Boolean cube, based on a new decomposition technique for monotone distributions. We use this method to show that our uniformity testing algorithm is optimal up to polylog(n) factors, and also to give exponential lower bounds on the complexity of several other problems, including testing whether a monotone distribution is identical to or ε-far from a fixed known monotone product distribution, and approximating the entropy of an unknown monotone distribution.
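The paper's tester works from random draws in poly(n) time; as a brute-force sketch just to pin down the two definitions involved (monotonicity, and L1 distance to uniform), for tiny n one can compute both exactly. The example distribution is a 3/4-biased product distribution, which is monotone yet far from uniform; all names here are illustrative.

```python
from itertools import product

def l1_to_uniform(p, n):
    """Exact L1 distance between a distribution p (dict: point -> prob)
    on {0,1}^n and the uniform distribution. Exponential in n, so this
    is only a definitional check, not the paper's algorithm."""
    u = 1.0 / 2 ** n
    return sum(abs(p.get(x, 0.0) - u) for x in product((0, 1), repeat=n))

def is_monotone(p, n):
    """P is monotone if y >= x coordinatewise implies P(y) >= P(x)."""
    pts = list(product((0, 1), repeat=n))
    return all(
        p.get(y, 0.0) >= p.get(x, 0.0) - 1e-12
        for x in pts for y in pts
        if all(a <= b for a, b in zip(x, y))
    )

# A 3/4-biased product distribution on {0,1}^3: monotone, far from uniform.
n, b = 3, 0.75
p = {x: b ** sum(x) * (1 - b) ** (n - sum(x))
     for x in product((0, 1), repeat=n)}
```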
Learning DNF from Random Walks
 In Proceedings of FOCS
, 2003
Abstract

Cited by 17 (3 self)
We consider a model of learning Boolean functions from examples generated by a uniform random walk on {0, 1}^n. We give a polynomial time algorithm for learning decision trees and DNF formulas in this model. This is the first efficient algorithm for learning these classes in a natural passive learning model where the learner has no influence over the choice of examples used for learning.
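The example oracle in this model is easy to sketch. One common variant of the uniform random walk on {0, 1}^n flips a single uniformly chosen coordinate per step (the paper's exact walk may differ in details, e.g. replacing a coordinate with a fresh random bit rather than flipping it); helper names are assumptions.

```python
import random

def random_walk_examples(f, n, steps, rng):
    """Labeled examples from a uniform random walk on {0,1}^n: start at
    a uniform point, then flip one uniformly chosen coordinate per step,
    labeling each visited point with f. The learner sees the sequence
    but has no control over which points are visited."""
    x = [rng.randrange(2) for _ in range(n)]
    out = [(tuple(x), f(x))]
    for _ in range(steps):
        i = rng.randrange(n)
        x[i] ^= 1
        out.append((tuple(x), f(x)))
    return out

rng = random.Random(1)
parity = lambda x: sum(x) % 2  # stand-in target function
walk = random_walk_examples(parity, n=8, steps=5, rng=rng)
```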
Maximum margin algorithms with Boolean kernels
 In Proceedings of the Sixteenth Annual Conference on Computational Learning Theory
, 2003
Abstract

Cited by 9 (2 self)
Recent work has introduced Boolean kernels with which one can learn over a feature space containing all conjunctions of length up to k (for any 1 ≤ k ≤ n) over the original n Boolean features in the input space. This motivates the question of whether maximum margin algorithms such as support vector machines can learn Disjunctive Normal Form expressions in the PAC learning model using this kernel. We study this question, as well as a variant in which structural risk minimization (SRM) is performed with the class hierarchy taken over the length of conjunctions. We show that such maximum margin algorithms do not PAC learn t(n)-term DNF for any t(n) = ω(1), even when used with such an SRM scheme. We also consider PAC learning under the uniform distribution and show that if the kernel uses conjunctions of length ω̃(√n) then the maximum margin hypothesis will fail on the uniform distribution as well. Our results concretely illustrate that margin-based algorithms may overfit when learning simple target functions with natural kernels.
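One standard Boolean kernel of this kind counts the monotone conjunctions of length 1..k satisfied by both inputs. Such a conjunction must use only coordinates that are 1 in both x and y, so the count has a closed form; the paper's kernel may also cover non-monotone conjunctions, and the names below are illustrative.

```python
from itertools import combinations
from math import comb

def conj_kernel(x, y, k):
    """Number of monotone conjunctions of length 1..k satisfied by both
    x and y in {0,1}^n: with s = |ones(x) & ones(y)| shared 1-coordinates,
    the count is sum_{j=1..k} C(s, j)."""
    s = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    return sum(comb(s, j) for j in range(1, k + 1))

def conj_kernel_bruteforce(x, y, k):
    """Direct enumeration of all conjunctions, to validate the formula."""
    n = len(x)
    total = 0
    for j in range(1, k + 1):
        for S in combinations(range(n), j):
            if all(x[i] for i in S) and all(y[i] for i in S):
                total += 1
    return total

x, y = (1, 0, 1, 1, 0), (1, 1, 1, 0, 0)
assert conj_kernel(x, y, 2) == conj_kernel_bruteforce(x, y, 2)
```

The closed form is what makes the kernel computable in O(n + k) time even though the feature space has exponentially many coordinates.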
Optimal Cryptographic Hardness of Learning Monotone Functions
Abstract

Cited by 4 (3 self)
A wide range of positive and negative results have been established for learning different classes of Boolean functions from uniformly distributed random examples. However, polynomial-time algorithms have thus far been obtained almost exclusively for various classes of monotone functions, while the computational hardness results obtained to date have all been for various classes of general (non-monotone) functions. Motivated by this disparity between known positive results (for monotone functions) and negative results (for non-monotone functions), we establish strong computational limitations on the efficient learnability of various classes of monotone functions. We give several such hardness results which are provably almost optimal, since they nearly match known positive results. Some of our results show cryptographic hardness of learning polynomial-size monotone circuits to accuracy only slightly greater than 1/2 + 1/√n; this accuracy bound is close to optimal by known positive results (Blum et al., FOCS '98). Other results show that under a plausible cryptographic hardness assumption, a class of constant-depth, subpolynomial-size circuits computing monotone functions is hard to learn; this result is close to optimal in terms of the circuit size parameter by known positive results as well (Servedio, Information and Computation '04). Our main tool is a complexity-theoretic approach to hardness amplification via noise sensitivity of monotone functions that was pioneered by O'Donnell (JCSS '04).
On learning random DNF formulas under the uniform distribution
 In Proc. 9th Internat. Workshop on Randomization and Computation (RANDOM’05
, 2005
Abstract

Cited by 4 (0 self)
We study the average-case learnability of DNF formulas in the model of learning from uniformly distributed random examples. We define a natural model of random monotone DNF formulas and give an efficient algorithm which with high probability can learn, for any fixed constant γ > 0, a random t-term monotone DNF for any t = O(n^{2−γ}). We also define a model of random non-monotone DNF and give an efficient algorithm which with high probability can learn a random t-term DNF for any t = O(n^{3/2−γ}). These are the first known algorithms that can learn a broad class of polynomial-size DNF in a reasonable average-case model of learning from random examples. ACM Classification: I.2.6, F.2.2, G.1.2, G.3
Lower Bounds and Hardness Amplification for Learning Shallow Monotone Formulas
Abstract

Cited by 2 (1 self)
Much work has been done on learning various classes of "simple" monotone functions under the uniform distribution. In this paper we give the first unconditional lower bounds for learning problems of this sort by showing that polynomial-time algorithms cannot learn shallow monotone Boolean formulas under the uniform distribution in the well-studied Statistical Query (SQ) model. We introduce a new approach to understanding the learnability of "simple" monotone functions, based on a recent characterization of strong SQ learnability by Simon (2007). Using the characterization, we first show that depth-3 monotone formulas of size n^{o(1)} cannot be learned by any polynomial-time SQ algorithm to accuracy 1 − 1/(log n)^{Ω(1)}. We then build on this result to show that depth-4 monotone formulas of size n^{o(1)} cannot be learned even to a certain 1/2 + o(1) accuracy in polynomial time. This improved hardness is achieved using a general technique that we introduce for amplifying the hardness of "mildly hard" learning problems in either the PAC or SQ framework. This hardness amplification for learning builds on ideas from the work of O'Donnell (2004) on hardness amplification for approximating functions using small circuits, and is applicable to a number of other contexts. Finally, we demonstrate that our approach can also be used to reduce the well-known open problem of learning juntas to learning of depth-3 monotone formulas.